Abstract: Let $\{X_k\}_{k \in \mathbb{Z}} \in L^2(\mathcal{T})$ be a stationary process with associated
lag operators $\mathcal{C}_h$. Uniform asymptotic expansions of the corresponding
empirical eigenvalues and eigenfunctions are established under almost optimal
conditions on the lag operators in terms of the eigenvalues (spectral gap).
In addition, the underlying dependence assumptions are optimal in a certain
sense, including both short and long memory processes. This allows us
to study the relative maximum deviation of the empirical eigenvalues
under very general conditions. Among other things, convergence to an extreme
value distribution is shown. We also discuss how the asymptotic expansions
transfer to the long-run covariance operator $\mathcal{G}$ in a general framework.


1. Introduction 

Principal component analysis (PCA) has emerged as one of the most important
tools in multivariate and high-dimensional data analysis. In the latter,
functional principal component analysis (FPCA) is becoming more and more
important. A comprehensive overview and some leading examples can be found
in [38], [45], [61]. Given a functional time series $X = \{X_k\}_{k \in \mathbb{Z}}$, it is typically
assumed that $X$ lies in the Hilbert space $L^2(\mathcal{T})$, where $\mathcal{T} \subset \mathbb{R}^d$ is compact. The
fundamental tool in the area of PCA and FPCA, both in theory and practice,
is the usage of (functional) principal components (FPC). To fix ideas, let us
introduce some notation. If $X$ is stationary with $\mathbb{E}[\|X_k\|_{L^2}^2] < \infty$, then the mean
$\mu = \mathbb{E}[X_k]$ and the covariance operator

$$\mathcal{C}(\cdot) = \mathbb{E}\big[\langle X_k - \mu, \cdot \rangle (X_k - \mu)\big] \qquad (1)$$


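exist. Here $\langle \cdot, \cdot \rangle$ denotes the inner product in $L^2$, and $\|\cdot\|_{L^2}$ the corresponding
norm. The eigenfunctions of $\mathcal{C}$ are called the functional principal components
and are denoted by $e = \{e_j\}_{j \in \mathbb{N}}$, i.e., we have $\mathcal{C}(e_j) = \lambda_j e_j$, where $\lambda = \{\lambda_j\}_{j \in \mathbb{N}}$
denotes the eigenvalues. The eigenfunctions $e$ are usually estimated by the
empirical eigenfunctions $\hat{e} = \{\hat{e}_j\}_{j \in \mathbb{N}}$, defined as the eigenfunctions of the empirical
covariance operator

$$\hat{\mathcal{C}}(\cdot) = \frac{1}{n} \sum_{k=1}^{n} \langle X_k - \bar{X}_n, \cdot \rangle (X_k - \bar{X}_n), \qquad (2)$$

where $\bar{X}_n = \frac{1}{n}\sum_{k=1}^{n} X_k$. Hence $\hat{\mathcal{C}}(\hat{e}_j) = \hat{\lambda}_j \hat{e}_j$, where $\hat{\lambda} = \{\hat{\lambda}_j\}_{j \in \mathbb{N}}$ denotes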

the empirical eigenvalues. Due to the fundamental importance of eigenfunctions
and eigenvalues for FPCA and PCA, results on the asymptotic behavior of
empirical eigenfunctions and eigenvalues are of high interest. [1] was
among the first to give such results (see also [22]) and established a CLT for
$\hat{\lambda}_j$ (resp. $\hat{e}_j$) if $j$ is fixed. Fueled by high-dimensional applications, uniform
bounds where $j$ increases with the sample size $n$ have become very important,
leading to a significant rise in the complexity of the problem. Well-known pathwise
bounds are provided in the lemma given below (cf. [8], [11]).

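Lemma 1.1. If $X \in L^2(\mathcal{T})$ and $\mathbb{E}[\|X_k\|_{L^2}^2] < \infty$, then

$$|\hat{\lambda}_j - \lambda_j| \le \|\hat{\mathcal{C}} - \mathcal{C}\|_{\mathcal{L}}, \qquad \|\hat{e}_j - e_j\|_{L^2} \le \frac{2\sqrt{2}}{\psi_j} \|\hat{\mathcal{C}} - \mathcal{C}\|_{\mathcal{L}},$$

where $\psi_j = \min\{\lambda_{j-1} - \lambda_j, \lambda_j - \lambda_{j+1}\}$ (with $\psi_1 = \lambda_1 - \lambda_2$) and $\|\cdot\|_{\mathcal{L}}$ denotes
the operator norm.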
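
The pathwise bounds of Lemma 1.1 can be checked numerically in a finite-dimensional stand-in for $L^2(\mathcal{T})$. The following is only a minimal sketch: the dimension, the sample size, the eigenvalue sequence $\lambda_j = j^{-2}$ and the Gaussian data are illustrative assumptions that do not come from the text, and the constant $2\sqrt{2}$ is the one stated in the lemma above. Eigenvectors are aligned by their sign before comparison.

```python
import numpy as np

rng = np.random.default_rng(0)

d, n = 20, 500                          # dimension and sample size (illustrative)
lam = 1.0 / np.arange(1, d + 1) ** 2    # assumed population eigenvalues j^{-2}
C = np.diag(lam)                        # population covariance; eigenvectors are unit vectors

# IID Gaussian sample with covariance C and the empirical covariance as in (2).
X = rng.standard_normal((n, d)) * np.sqrt(lam)
Xc = X - X.mean(axis=0)
C_hat = Xc.T @ Xc / n

# eigh returns ascending eigenvalues; reverse to the descending convention.
lam_hat, e_hat = np.linalg.eigh(C_hat)
lam_hat, e_hat = lam_hat[::-1], e_hat[:, ::-1]

op_norm = np.linalg.norm(C_hat - C, ord=2)   # operator (spectral) norm

J = 5
results = []
for j in range(J):                      # zero-based j corresponds to j + 1 in the text
    if j == 0:
        gap = lam[0] - lam[1]
    else:
        gap = min(lam[j - 1] - lam[j], lam[j] - lam[j + 1])
    e_j = np.zeros(d)
    e_j[j] = 1.0
    sign = 1.0 if e_hat[:, j] @ e_j >= 0 else -1.0   # sign alignment
    eig_err = abs(lam_hat[j] - lam[j])
    vec_err = np.linalg.norm(e_hat[:, j] - sign * e_j)
    vec_bound = 2.0 * np.sqrt(2.0) / gap * op_norm
    results.append((eig_err, vec_err, vec_bound))
    print(j + 1, eig_err, op_norm, vec_err, vec_bound)
```

In such experiments the eigenvector bound typically becomes very loose as $j$ grows, because the gap $\psi_j$ shrinks quickly; this is the lack of sharpness that motivates the expansions developed in the paper.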

Remark 1.2. Strictly speaking, we consider the difference $\hat{e}_j - c_j e_j$, where
$c_j = \operatorname{sign}(\langle \hat{e}_j, e_j \rangle)$. Since $c_j$ is unidentifiable, we assume without loss of generality
that $c_j = 1$ throughout the sequel, which is the common approach in
the literature.

The attractiveness of the above bounds lies in their simplicity, but
unfortunately they are far from optimal from a probabilistic perspective. Indeed,
the results of [22] tell us that in case of $\hat{\lambda}_j - \lambda_j$, the correct bound should
include the additional factor $\lambda_j$, i.e., $\lambda_j \|\hat{\mathcal{C}} - \mathcal{C}\|_{\mathcal{L}}$. A similar claim can be made for
$\|\hat{e}_j - e_j\|_{L^2}$. In this spirit, based on Lemma 1.1, asymptotic expansions for $\hat{\lambda}_j - \lambda_j$
and $\hat{e}_j - e_j$ which allow for increasing $j$ have been established in [29], [30], [31]
(see also [11], [16], [52]). These results have proved to be an indispensable tool
in the literature, see for instance [9], [15], [16], [29], [38], [45], [53] to name a few.
But the corresponding (asymptotic) analysis is often based on heavy structural
assumptions regarding $X$ and the spacings (spectral gap) $\Psi = \{\psi_j\}_{j \in \mathbb{N}}$ of the
eigenvalues, limiting its applicability. In particular, often only the covariance
operator $\mathcal{C}$ is considered, and a common key assumption is that $X$ is an IID
sequence, which is rather restrictive, see [35], [38], [58] and also Sections 2.2 and
6.2. In the presence of serial correlation, the lag operators $\mathcal{C}_h$ and the long-run
covariance operator $\mathcal{G}$, formally defined as

$$\mathcal{C}_h(\cdot) = \mathbb{E}\big[\langle X_k - \mu, \cdot \rangle (X_{k-h} - \mu)\big], \qquad \mathcal{G}(\cdot) = \sum_{h \in \mathbb{Z}} \mathcal{C}_h(\cdot), \qquad (3)$$

serve as a generalization of $\mathcal{C} = \mathcal{C}_0$. They play a fundamental role for
dependent functional time series, see for instance [32], [57], [58]. In this paper, we
consider a general framework that contains both $\mathcal{C}_h$ and $\mathcal{G}$, avoiding the
previously mentioned limitations. We derive exact asymptotic expansions of $\hat{\lambda}_j$, $\hat{e}_j$
under optimal dependence assumptions, allowing for short memory (weak
dependence), but also for long memory (strong dependence) in case of $\mathcal{C}_h$, $h$ finite.
In addition, we only require a 'natural condition' concerning the spectral gap
$\Psi$. It turns out that this condition is nearly optimal.

As a particular application, we study the relative maximum deviation of the
empirical eigenvalues of $\hat{\mathcal{C}}$, namely

$$T_{J_n^+} = \sqrt{n} \max_{1 \le j \le J_n^+} \frac{|\hat{\lambda}_j - \lambda_j|}{\lambda_j},$$

where $J_n^+ \to \infty$, see Proposition 2.4 for a precise definition of $J_n^+$. Under mild
assumptions, we show that

$$a_n \big(T_{J_n^+} - b_n\big) \xrightarrow{w} V, \qquad (4)$$

where $V$ is a distribution of Gumbel type. The latter is based on a
high-dimensional Gaussian approximation, which is of independent interest, see Theorem
10.2. Result (4) is particularly important for the construction of simultaneous
confidence sets and tests for the relevant number of FPCs to be used for
statistical inference or modelling (cf. [5], [45], [61]). The range of further applications
is surveyed in Section 6. Here we also touch on the possibility of long memory
in functional time series.

An outline of the paper can be given as follows. In Section 2 the key
expansions of $\hat{\lambda}_j$ and $\hat{e}_j$ are established in a general framework, alongside some
additional results. In particular, we discuss in detail the optimality of the
underlying assumptions. Asymptotic expansions of $\hat{\lambda}_j$ and $\hat{e}_j$ in the context of $\mathcal{C}_h$
and $\mathcal{G}$ are established in Sections 3 and 4, whereas Section 5 is devoted to the
study of (4). Additional fields of application are surveyed in Section 6, with an
emphasis on functional linear regression, ARH(1) processes and long memory
in a functional context. The proofs of the eigen expansions are given in Sections
7, 8 and 9. In Section 10.1, a general high-dimensional Gaussian approximation
under dependence is established. Based on this result, we prove (4) in Section
10.2. Finally, Section 11 presents the proofs of Section 6.

2. Preliminary notation and main asymptotic expansions 

For $p \ge 1$, denote with $\|\cdot\|_p$ the $L^p$-norm $\mathbb{E}[|\cdot|^p]^{1/p}$. We write $\lesssim$, $\gtrsim$, ($\sim$) to denote
(two-sided) inequalities involving a multiplicative constant, $a \wedge b = \min\{a, b\}$
and $a \vee b = \max\{a, b\}$. Given a set $\mathcal{A}$, we denote with $\mathcal{A}^c$ its complement.
Moreover, we write $\bar{X} = X - \mathbb{E}[X]$ for a random variable $X$.

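In the sequel, it is convenient to first consider a more abstract framework.
Assume that the operator $\mathcal{D} : L^2(\mathcal{T}) \mapsto L^2(\mathcal{T})$ has non-negative eigenvalues
$\lambda = \{\lambda_j\}_{j \in \mathbb{N}}$ and eigenfunctions $e = \{e_j\}_{j \in \mathbb{N}}$, and satisfies the spectral
representation

$$\mathcal{D}(\cdot) = \sum_{j=1}^{\infty} \lambda_j \langle e_j, \cdot \rangle e_j, \qquad \sum_{j=1}^{\infty} \lambda_j < \infty. \qquad (5)$$

For a sequence of non-negative numbers $\{\tilde{\lambda}_j\}_{j \in \mathbb{N}}$ with $\sum_{j=1}^{\infty} \tilde{\lambda}_j < \infty$ and
real-valued random variables $\{\eta^{\mathcal{D}}_{i,j}\}_{i,j \in \mathbb{N}}$, $\{\eta^{\mathcal{R}}_{i,j}\}_{i,j \in \mathbb{N}}$, consider the empirical version

$$\hat{\mathcal{D}}(\cdot) = \sum_{i,j=1}^{\infty} \sqrt{\tilde{\lambda}_i \tilde{\lambda}_j}\, \big(\eta^{\mathcal{D}}_{i,j} - \eta^{\mathcal{R}}_{i,j}\big) \langle e_i, \cdot \rangle e_j, \quad \text{with } \hat{\mathcal{D}}(\hat{e}_j) = \hat{\lambda}_j \hat{e}_j, \ j \in \mathbb{N},$$

$$\text{where we demand} \quad \mathcal{D}(\cdot) = \sum_{i,j=1}^{\infty} \sqrt{\tilde{\lambda}_i \tilde{\lambda}_j}\, \mathbb{E}\big[\eta^{\mathcal{D}}_{i,j}\big] \langle e_i, \cdot \rangle e_j. \qquad (6)$$

The random variables $\eta^{\mathcal{D}}_{i,j}$ denote the contributing random components, whereas
$\eta^{\mathcal{R}}_{i,j}$ denote the negligible parts. In the sequel, both random variables depend
on a sequence $m \to \infty$, i.e., $\eta^{\mathcal{D}}_{i,j} = \eta^{\mathcal{D}}_{i,j}(m)$ and $\eta^{\mathcal{R}}_{i,j} = \eta^{\mathcal{R}}_{i,j}(m)$. To simplify the
notation, we often suppress this dependence if it is of no immediate relevance.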
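This class of (empirical) operators is rich enough to include the lag operators $\mathcal{C}_h$
(in fact only $\mathcal{C}_h^* \mathcal{C}_h$, see Section 3), but also the more general long-run covariance
operator $\mathcal{G}$ (see Section 4). In order to provide an intuition for this setup, let us
discuss how this translates in case of the covariance operator $\mathcal{C}$, hence $\mathcal{D} = \mathcal{C}$
and $\hat{\mathcal{D}} = \hat{\mathcal{C}}$. Then obviously $\tilde{\lambda}_j = \lambda_j$ and for $m = n$ we have

$$\eta^{\mathcal{C}}_{i,j}(n) = \frac{1}{n} \sum_{k=1}^{n} \eta_{k,i} \eta_{k,j}, \qquad \eta^{\mathcal{R}}_{i,j}(n) = \frac{1}{n^2} \sum_{k,l=1}^{n} \eta_{k,i} \eta_{l,j}, \qquad \eta_{k,j} = \frac{\langle X_k, e_j \rangle}{\lambda_j^{1/2}}. \qquad (7)$$

Clearly, if $X$ is stationary, then so is $\{\eta_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ and hence $\mathcal{C}$ does not depend
on $n$ in this case. We also note that $\mathbb{E}[\eta^{\mathcal{C}}_{j,j}] = 1$ and $\mathbb{E}[\eta^{\mathcal{C}}_{i,j}] = 0$ for $i \ne j$
since $\mathbb{E}[\eta_{k,i} \eta_{k,j}] = 0$ by the classical Karhunen-Loève expansion (cf. [38]). This
is actually true in a more general fashion. Since $e$ are the eigenfunctions of $\mathcal{D}$,
the two representations given in (5) and (6) yield that $(\tilde{\lambda}_i \tilde{\lambda}_j)^{1/2}\, \mathbb{E}[\eta^{\mathcal{D}}_{i,j}] = 0$
for $i \ne j$. For the sake of reference, we formulate this simple observation as a
lemma.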


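Lemma 2.1. Assume $\mathcal{D}$ satisfies (5) and (6) with eigenvalues $\lambda$ and
eigenfunctions $e$. Then $(\tilde{\lambda}_i \tilde{\lambda}_j)^{1/2}\, \mathbb{E}[\eta^{\mathcal{D}}_{i,j}] = 0$ for $i \ne j$ and $\lambda_j = \tilde{\lambda}_j\, \mathbb{E}[\eta^{\mathcal{D}}_{j,j}]$.

Most of our results in the sequel depend on the centered version of $\eta^{\mathcal{D}}_{i,j}$, i.e.,

$$\bar{\eta}^{\mathcal{D}}_{i,j} = \eta^{\mathcal{D}}_{i,j} - \mathbb{E}\big[\eta^{\mathcal{D}}_{i,j}\big], \qquad i, j \in \mathbb{N}.$$

We now demand the following conditions.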

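Assumption 2.2. The operators $\mathcal{D}$, $\hat{\mathcal{D}}$ satisfy (5) and (6). Moreover, for a
universal constant $C^{\mathcal{D}}$ and a universal sequence $s_m^{\mathcal{D}} = o(1)$ and $a > 0$, $\mathfrak{h}, p \ge 1$,
$J_m^+ \in \mathbb{N}$ and $m \to \infty$ it holds that

(D1) $m^{1/2} \max_{i,j \in \mathbb{N}} \|\bar{\eta}^{\mathcal{D}}_{i,j}(m)\|_q \le C^{\mathcal{D}}$ and $m^{1/2} \max_{i,j \in \mathbb{N}} \|\eta^{\mathcal{R}}_{i,j}(m)\|_q \le s_m^{\mathcal{D}}$
for $q = p 2^{\mathfrak{p}+4}$, $\mathfrak{p} = \lceil \mathfrak{h}/a \rceil$,

(D2) $\max_{1 \le j \le J_m^+} \Big\{ m^{-1/2+a} \sum_{i \ne j} \frac{\lambda_i}{|\lambda_j - \lambda_i|},\ m^{-1+2a} \sum_{i \ne j} \frac{\lambda_i \lambda_j}{(\lambda_j - \lambda_i)^2} \Big\} \le C^{\mathcal{D}}$ and $\lambda_{J_m^+} \ge m^{-\mathfrak{h}} / C^{\mathcal{D}}$,

(D3) $1/C^{\mathcal{D}} \le \mathbb{E}[\eta^{\mathcal{D}}_{j,j}(m)] \le C^{\mathcal{D}}$ for $j \in \mathbb{N}$ and [...] $\le C^{\mathcal{D}}$.

Remark 2.3. Note that in the above assumptions, $\lambda$ may depend on $m$. We
can deal with this case in the sequel due to the universal bounds provided by
$C^{\mathcal{D}}$.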

Let us discuss these assumptions and compare them to the literature. As
a general preliminary remark, we note that all of our results have analogues
in a general Hilbert space setting $\mathbb{H}$. Working in $L^2(\mathcal{T})$ is notationally less
burdensome though, and the proofs are simpler. In particular, the Fubini-Tonelli
Theorem allows us to interchange the order of inner products and expectations.
Since most related results in the literature focus on the covariance
operator $\mathcal{C}$, we also consider this setup for our discussion, i.e., $\mathcal{D} = \mathcal{C}$ (and
$\hat{\mathcal{D}} = \hat{\mathcal{C}}$). To this end, it is convenient to translate Assumption 2.2 to this special
case to make the comparison transparent. Recall the notation introduced in (7).
We then have the following result.

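Proposition 2.4. Let $X$ be stationary with $\mathbb{E}[\|X_k\|_{L^2}^2] \le C^{\mathcal{C}}$ for a universal
constant $C^{\mathcal{C}}$. Then $\mathcal{C}$ satisfies (5) and (6) with summable eigenvalues $\lambda$ and
eigenfunctions $e$. Assume in addition that for some $a > 0$, $\mathfrak{h}, p \ge 1$ and universal
sequence $s_n^{\mathcal{C}} = o(1)$ we have that

(C1) $n^{1/2} \max_{i,j \in \mathbb{N}} \|\bar{\eta}^{\mathcal{C}}_{i,j}(n)\|_q \le C^{\mathcal{C}}$, $n^{-3/4} \max_{j \in \mathbb{N}} \|\sum_{k=1}^{n} \eta_{k,j}\|_{2q} \le s_n^{\mathcal{C}}$
for $q = p 2^{\mathfrak{p}+4}$, $\mathfrak{p} = \lceil \mathfrak{h}/a \rceil$,

(C2) (D2) holds with $C^{\mathcal{D}} = C^{\mathcal{C}}$, $m = n$, $J_n^+ \in \mathbb{N}$ and $a$ as above.

Then Assumption 2.2 holds for $\mathcal{D} = \mathcal{C}$ with $a > 0$, $\mathfrak{h}, p \ge 1$, $m = n$, $J_n^+ \in \mathbb{N}$,
$s_m^{\mathcal{D}} = s_n^{\mathcal{C}}$ and $C^{\mathcal{D}} = C^{\mathcal{C}}$ as above.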

Let us now compare the literature with Proposition 2.4. 

Dependence assumptions: Assumption (C1) implicitly imposes a dependence
assumption on the scores $\eta_{k,j}$. In contrast to the literature (cf. [21], [30], [31], [52]),
we do not require the typical independence assumption. In fact, (C1) is much
more general. In Section 2.2 we also discuss why looking at $\mathcal{C}$ under dependence
can be relevant in practice. It can be shown that (C1) holds under general,
sharp weak dependence conditions. This means that if these conditions fail, we
no longer have weak dependence. However, much more is valid. Suppose that
$\eta_{k,j} = \sum_{i=0}^{\infty} \alpha_{i,j} \epsilon_{k-i,j}$ where $\{\epsilon_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ is standard Gaussian and IID and
$\alpha_{i,j} \sim i^{-a}$, $a > 1/2$. Then we show in Section 2.2 that

$$\big\| \|\hat{\mathcal{C}} - \mathcal{C}\|_{L^2} \big\|_2 \lesssim n^{-1/2} \ \text{ is equivalent with '(C1) holds for any fixed } p \ge 1\text{'}, \qquad (8)$$

where $\|\hat{\mathcal{C}} - \mathcal{C}\|_{L^2}$ denotes the Hilbert-Schmidt norm. Hence the rate $n^{-1/2}$
carries over and (C1) poses no restriction, as long as we consider the CLT-domain
(normalization with $n^{-1/2}$). In this sense, condition (C1) is optimal (in the
CLT-domain). Interestingly, this also allows for long memory sequences, and we even
obtain a CLT for $\hat{\lambda}_j$ and $\hat{e}_j$ under long memory conditions, i.e., where
$\sum_{i=0}^{\infty} |\alpha_{i,j}| = \infty$, see Theorem 2.9. Note that it is shown in [54] that $\sum_{i=0}^{\infty} |\alpha_i| < \infty$ is necessary
for the validity of a CLT for $\bar{X}_n$ in an infinite dimensional Hilbert space (a
different normalization doesn't help here, which is different from the univariate
case, see [54] for details). Note that the condition $\max_{j \in \mathbb{N}} \|n^{-3/4} \sum_{k=1}^{n} \eta_{k,j}\|_{2q} = o(1)$
usually comes for 'free' due to the additional factor $n^{-1/4}$, and is only
necessary to control the empirical mean correction $\bar{X}_n$. Finally, we remark that our
method of proof can also be used to derive corresponding results in the
non-central domain, i.e., where $\| \|\hat{\mathcal{C}} - \mathcal{C}\|_{L^2} \|_2 \sim b_n$ with $\sqrt{n} = o(b_n)$. To keep this
exposition at reasonable length, this is not pursued here.


Structural conditions for eigenvalues: (C2) is the key condition regarding the
structure of the eigenvalues $\lambda_j$. Note that the special form of the terms
appearing in (C2) is no coincidence, and is connected to the variance of the
asymptotic distribution of the empirical eigenfunctions $\hat{e}_j$ (cf. [22]). The literature
(cf. [16], [21], [29], [30], [31]) usually requires polynomial, exponential or convex
structures regarding the decay rate of the eigenvalues and particularly the
spacing $\psi_j$. For instance, a common minimum assumption is that $\psi_j \ge \lambda_j j^{-1}$, which
reflects a polynomial behavior of the eigenvalues $\lambda_j$. As will be discussed below
Theorem 2.6, (C2) turns out to be much weaker; in fact, we shall see that it is
nearly optimal. To get a feeling for the implications of (C2), let us consider the
case where $\lambda_j$ satisfies a convexity condition, i.e.,

$$\text{the function } \lambda(x) : x \mapsto \lambda_x \text{ is convex.} \qquad (9)$$

If (9) holds, then one may verify (cf. Lemma 7.13) that

$$\sum_{i=1,\, i \ne j}^{\infty} \frac{\lambda_i}{|\lambda_j - \lambda_i|} \lesssim j \log j \quad \text{and} \quad \sum_{i=1,\, i \ne j}^{\infty} \frac{\lambda_i \lambda_j}{(\lambda_j - \lambda_i)^2} \lesssim j^2, \qquad (10)$$

hence (C2) is valid if $J_n^+ \lesssim n^{1/2 - a} (\log n)^{-1}$. Note that these bounds are not
directly influenced by the decay of $\lambda$ or $\Psi$. The convexity condition (9) itself is
mild and includes many cases encountered in the literature (cf. [21]), in
particular polynomial or exponential cases

$$\lambda_j \sim j^{\mathfrak{r}} \rho^{j}, \ 0 < \rho < 1, \ |\mathfrak{r}| < \infty \quad \text{or} \quad \lambda_j \sim j^{-\mathfrak{r}}, \ \mathfrak{r} > 1. \qquad \text{(EP)}$$

Also note that (C2) implies that the first $J_n^+$ eigenvalues are distinct. See [22]
for a flavour of results which allow for eigenspaces with rank greater than one.
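
To get a feeling for the two bounds in (10), one can evaluate the sums numerically for a concrete convex eigenvalue sequence. The sketch below assumes the polynomial case $\lambda_i = i^{-2}$ and truncates the infinite sums at a large index; both choices, as well as the helper name `gap_sums`, are illustrative assumptions and not part of the original text. It reports the sums normalized by $j \log j$ and $j^2$, respectively.

```python
import math

def gap_sums(j, r=2.0, n_terms=200_000):
    """Truncated versions of the two sums in (10) for lambda_i = i^{-r}."""
    lam_j = j ** (-r)
    s1 = 0.0
    s2 = 0.0
    for i in range(1, n_terms + 1):
        if i == j:
            continue                      # the term i = j is excluded
        lam_i = i ** (-r)
        diff = lam_j - lam_i
        s1 += lam_i / abs(diff)           # sum of lambda_i / |lambda_j - lambda_i|
        s2 += lam_i * lam_j / diff ** 2   # sum of lambda_i lambda_j / (lambda_j - lambda_i)^2
    return s1, s2

ratios = []
for j in (2, 5, 10, 20, 50):
    s1, s2 = gap_sums(j)
    ratios.append((j, s1 / (j * math.log(j)), s2 / j ** 2))

for j, r1, r2 in ratios:
    print(j, round(r1, 3), round(r2, 3))
```

Both normalized quantities stay bounded as $j$ grows, in line with (10); for $\lambda_i = i^{-2}$ this can also be verified by hand with a partial fraction decomposition.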


Moment assumptions: The existence of all moments (often with additional
Gaussian-like growth conditions) is usually required in the literature (cf. [21], [30], [31], [52])
in the context of expansions for $\hat{\lambda}_j$, $\hat{e}_j$. In contrast, we only require a finite
number of moments, which, however, may be large. On the other hand, all of our
results will be expressed in terms of the $\|\cdot\|_p$-norm, and moving over to the
weaker $\mathcal{O}_P(\cdot)$ formulation, the moment assumptions can be lowered.

For stating our results, we introduce the quantity

$$I_{i,j} = \big\langle (\hat{\mathcal{D}} - \mathcal{D})(e_i), e_j \big\rangle, \qquad i, j \in \mathbb{N}, \qquad (11)$$

which is one of the main contributing parts in the expansions given below.
We first give the main results, followed by a discussion and comparison to the
literature. For the empirical eigenvalues $\hat{\lambda}_j$, we have the following.
Theorem 2.5. Assume that Assumption 2.2 holds. Then for $1 \le J \le J_m^+$


max 

i <j<J 



J 1 / p m~ a 

y/rn 


The above result provides an exact uniform first-order expansion for $\hat{\lambda}_j$. For
a nonuniform version, the factor $J^{1/p}$ in the bound on the RHS can be dropped.
Next, we state the companion result for the empirical eigenfunctions $\hat{e}_j$.

Theorem 2.6. Assume that Assumption 2.2 holds. Then for $1 \le J \le J_m^+$


max 
i <j<J 


1 e 3 n- 

: ( e 7 - e i + —\\ e 3 - ' 


V 


'3 IIl 2 


oo 


k= 1 



L 2 


< 


J x ! p m- a 


where A j = YX= i 


At A If 

(Aj-AO 5 *' 


and we also have 


max 

i <j<J 


1 

A? 



OO 


J 2 

1 k,j 


( A i " 


< 




TO 


Theorem 2.6 provides both uniform expansions for $\hat{e}_j$ and the corresponding
norm. As before, the factor $J^{1/p}$ in the bound on the RHS can be dropped for
a nonuniform version. We also have a slight modification of Theorems 2.5 and
2.6.

Proposition 2.7. Assume that Assumption 2.2 holds. Then for $1 \le J \le J_m^+$,
one may replace $\{I_{k,j}\}_{k,j \in \mathbb{N}}$ with $\{(\tilde{\lambda}_k \tilde{\lambda}_j)^{1/2}\, \bar{\eta}^{\mathcal{D}}_{k,j}\}_{k,j \in \mathbb{N}}$ in Theorems 2.5 and 2.6.
Recall also that $\lambda_j = \tilde{\lambda}_j\, \mathbb{E}[\eta^{\mathcal{D}}_{j,j}]$ by Lemma 2.1.

As an immediate corollary, we obtain a probabilistic version of Lemma 1.1 of
correct order.

Corollary 2.8. Assume that Assumption 2.2 holds. Then for $1 \le j \le J_m^+$

2.1. Previous results and comparison

Let us now compare Theorems 2.5 and 2.6 to the literature in case of $\mathcal{D} = \mathcal{C}$.
It seems that the currently best known expansions in this context can be found
in [31]. Among other things, it is required that $\{X_k\}_{k \in \mathbb{Z}}$ is IID, all moments
exist, and the error term $ER_{J_n^+}$ in the expansions of $\hat{\lambda}_j - \lambda_j$ (not weighted with
$\lambda_j^{-1}$) is of magnitude

$$ER_{J_n^+} = \max_{1 \le j \le J_n^+} n^{-3/2} (1 - \iota_j)^{-1/2} \psi_j^{-3} \lambda_j^{-1/2} s_j, \qquad s_j = \sup_{t \in \mathcal{T}} |e_j(t)|, \qquad (12)$$

and $\iota_j \in (0,1)$ is defined as $\iota_j = \inf_{k < j} (1 - \lambda_j / \lambda_k)$. We emphasize that this is
the overall error term, hence one requires for instance at least $\sqrt{n}\, ER_{J_n^+} = o(1)$
for the validity of a CLT, and $(n / \lambda_{J_n^+}^2)^{1/2} ER_{J_n^+} = o(1)$ for a weighted version.
If we assume the convexity condition (9), we see that (C2) is much weaker.
In fact, taking for instance $\lambda_j \sim j^{-\mathfrak{c}}$ we find that $ER_{J_n^+} \gtrsim n^{-3/2} (J_n^+)^{3 + 7\mathfrak{c}/2}$.
On the other hand, we see from (10) that if $J_n^+ \sim n^{1/2 - a}$, $a > 0$, we still
obtain valid asymptotic expansions, i.e., the expressions containing $I_{i,j}$ are still
the principal terms in our expansions, reflecting the exact asymptotic behavior.
In stark contrast, $ER_{J_n^+}$ already explodes for $a$ small (resp. $\mathfrak{c}$ large) enough,
rendering a vacuous result. Similarly, (C2) is valid if we only require

$$\max_{1 \le j \le J_n^+} n^{-1/2} / \psi_j \lesssim n^{-a} \quad \text{for some arbitrary } a > 0, \qquad (13)$$

and again obtain valid asymptotic expansions. On the other hand, the actual
approximation error $ER_{J_n^+}$ in [31] may even be unbounded, since $1/\lambda_j \to \infty$ as
$j$ increases. In this sense, Assumption 2.2 is substantially weaker.

2.2. Dependence assumptions: optimality 

Throughout this section, we assume that $\mathcal{D} = \mathcal{C}$. We first present the following
result.

Theorem 2.9. Assume that $X$ has zero mean such that for $a > 3/4$

$$\eta_{k,j} = \sum_{i=0}^{\infty} \alpha_{i,j} \epsilon_{k-i,j}, \quad 0 < \alpha_{i,j} \sim i^{-a} \ \text{ and } \epsilon_{k,j} \text{ are standard Gaussian IID.}$$

Then (C1) holds. Moreover, if we have in addition (C2) (for $J_n^+$ possibly finite),
then for any fixed $1 \le j \le J_n^+$

$$\sqrt{n} (\hat{\lambda}_j - \lambda_j) \xrightarrow{w} \mathcal{N}\big(0, \lambda_j^2 \sigma_{\lambda_j}^2\big) \quad \text{and} \quad \sqrt{n} (\hat{e}_j - e_j) \xrightarrow{w} \mathcal{N}\big(0, \Sigma_{e_j}\big),$$

where $\xrightarrow{w}$ denotes weak convergence in the corresponding (Hilbert) space, and
$\sigma_{\lambda_j}^2$ ($\Sigma_{e_j}$) denotes the corresponding variance (operator).

The above result indicates that $a = 3/4$ is the boundary value for a CLT
with normalization $\sqrt{n}$, see also the discussion in [3], [12]. In fact, given the
linear structure of $\eta_{k,j}$ one readily computes that

$$\big\| \|\hat{\mathcal{C}} - \mathcal{C}\|_{L^2} \big\|_2 \lesssim n^{-1/2} \quad \text{iff} \quad a > 3/4.$$

On the other hand, Lemma 7.4 below yields that (C1) implies $\| \|\hat{\mathcal{C}} - \mathcal{C}\|_{L^2} \|_2 \lesssim n^{-1/2}$.
Hence we obtain the equivalence in (8). Finally, note that the regime
$1/2 < a < 1$ is generally considered as long memory. Hence by Theorem 2.9
above, we obtain a CLT for $\hat{\lambda}_j$ and $\hat{e}_j$ even in the presence of long memory, where
$3/4 < a < 1$. If $1/2 < a < 3/4$, non-central limit theorems arise. If $a \le 1/2$,
then $\mathbb{E}[\|X_0\|_{L^2}^2] = \infty$, which requires a completely different treatment.
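
The origin of the threshold $3/4$ can be made plausible by a short heuristic computation. The following is only a sketch for a single Gaussian score sequence with $\alpha_i \sim i^{-a}$ and $1/2 < a < 1$; the autocovariance notation $\gamma(h)$ and the constant $c_a$ are introduced here for illustration and do not appear in the original text.

```latex
% Autocovariance of \eta_k = \sum_{i \ge 0} \alpha_i \epsilon_{k-i} with \alpha_i \sim i^{-a}, 1/2 < a < 1:
\gamma(h) = \sum_{i \ge 0} \alpha_i \alpha_{i+h} \sim c_a\, h^{1-2a}, \qquad h \to \infty.

% For Gaussian \eta_k, Isserlis' formula gives Cov(\eta_k^2, \eta_l^2) = 2 \gamma(k-l)^2, hence
n \operatorname{Var}\Big( \frac{1}{n} \sum_{k=1}^{n} \eta_k^2 \Big)
  = 2 \sum_{|h| < n} \Big( 1 - \frac{|h|}{n} \Big) \gamma(h)^2 .

% The right-hand side stays bounded in n iff \sum_h \gamma(h)^2 < \infty, that is, iff
\sum_{h \ge 1} h^{2-4a} < \infty
  \iff 4a - 2 > 1
  \iff a > 3/4 .
```

So the diagonal entries of $\hat{\mathcal{C}} - \mathcal{C}$ fluctuate at the rate $n^{-1/2}$ exactly when $a > 3/4$, even though the scores themselves are long-memory for $a < 1$.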


2.3. Spectral gap: almost optimality 


Next, we discuss the issue of 'almost optimality' of condition (C2). To this end,
we draw heavily from the noteworthy results of [52]. Suppose that $\{\eta_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$
are IID and satisfy $\mathbb{E}[|\eta_{1,j}|^{2p}] \le p!\, C^{p-1}$ for some constant $C > 0$. If a structure
condition like (EP) holds, then it is shown in [52] that

$$\mathbb{E}\big[\|\hat{e}_j - e_j\|_{L^2}^2\big] \lesssim \frac{j^2 (\log n)^2}{n}. \qquad (14)$$

As can be seen from Corollary 2.8, this bound deviates from the optimal one by
the additional factor $(\log n)^2$. On the other hand, note that in the polynomial
case in (EP), this bound is also valid for $j \ge J_n^+$ (we require $a > 0$), which is
a slightly larger region. In [52], a lower bound is also provided, which is $\frac{j^2}{n} \wedge 1$.
Strictly speaking, it is proven for the projection $\pi_j = e_j \otimes e_j$, where $\otimes$ denotes
the one-rank operation

$$u \otimes v(w) = \langle u, w \rangle v, \qquad u, v, w \in L^2(\mathcal{T}).$$

According to [52], it then holds that (recall that $\|\cdot\|_{\mathcal{L}}$ denotes the operator norm)

$$\frac{j^2}{n} \wedge 1 \lesssim \mathbb{E}\big[\|\hat{\pi}_j - \pi_j\|_{\mathcal{L}}^2\big] \lesssim \frac{j^2 (\log n)^2}{n} \wedge 1. \qquad (15)$$

Heuristically, this may also be inferred from [22]. On the other hand, Corollary
2.8 and elementary computations yield

$$\mathbb{E}\big[\|\hat{\pi}_j - \pi_j\|_{\mathcal{L}}^2\big] \lesssim \frac{1}{n} \sum_{k \ne j} \frac{\lambda_j \lambda_k}{(\lambda_j - \lambda_k)^2} \lesssim \frac{j^2}{n} \quad \text{if } j \lesssim n^{1/2 - a} (\log n)^{-1} \qquad (16)$$

(in the polynomial case) and thus the order of the upper and lower bounds match
for $j \lesssim n^{1/2 - a} (\log n)^{-1}$. If $j \ge n^{1/2}$, Cauchy-Schwarz yields the trivial optimal
upper bound. Since $a > 0$ may be chosen arbitrarily small given sufficiently many
(all) moments, we find that our conditions on the eigenvalues $\lambda$ are essentially
optimal. In other words, we obtain exact expansions and the optimal error bound
for almost the complete region of indices $j$ where (16) still converges to zero.

3. Lag operator

While the covariance operator $\mathcal{C}$ is a key object for serially uncorrelated data
$X$, the lag operator $\mathcal{C}_h$ and the long-run covariance operator $\mathcal{G}$ become more
relevant in the presence of serial correlation. We focus on $\mathcal{C}_h$ in this section, and
then carry out a similar program in Section 4 for $\mathcal{G}$. To facilitate the discussion,
let us first introduce a popular notion of weak dependence. In the remainder of
this section, we assume that for each $j \in \mathbb{N}$, the score sequence $\{\eta_{k,j}\}_{k \in \mathbb{Z}}$ is a
causal weak Bernoulli sequence, which can be written as

$$\eta_{k,j} = g_j(\ldots, \epsilon_{k-1,j}, \epsilon_{k,j}) \qquad (17)$$

for some measurable functions $g_j$ and IID sequences $\{\epsilon_k\}_{k \in \mathbb{Z}}$ with $\epsilon_k = \{\epsilon_{k,j}\}_{j \in \mathbb{N}}$.
We do not specify any crosswise dependence between $\epsilon_{k,i}$, $\epsilon_{k,j}$ for $i \ne j$, allowing
for a large flexibility. Let $\xi_{k,j} = (\epsilon_{i,j},\, i \le k)$. To quantify the dependence of
$\eta_{k,j}$, we adopt the coupling idea. Let $\{\epsilon'_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ be an IID copy of
$\{\epsilon_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ and $\xi'_{k,j} = (\ldots, \epsilon_{-1,j}, \epsilon'_{0,j}, \epsilon_{1,j}, \ldots, \epsilon_{k,j})$ the coupled version of $\xi_{k,j}$.
Then we define

$$\Omega_p(k) = \max_{j \in \mathbb{N}} \|\eta_{k,j} - \eta'_{k,j}\|_p \ \text{ for } p \ge 1, \quad \text{where } \eta'_{k,j} = g_j(\xi'_{k,j}). \qquad (18)$$

Roughly speaking, $\Omega_p(k)$ measures the overall degree of dependence of $\eta_{k,j} =
g_j(\xi_{k,j})$ on $\epsilon_0$, and it is directly related to the data-generating mechanism of the
underlying process ([62] refers to $\Omega_p(k)$ as physical dependence measure). This
dependence concept is well established in the literature, and popular processes
like ARMA, GARCH, iterated random functions etc. fit into this framework
(cf. [62], [63]). Consider for example the linear process $\eta_{k,j} = \sum_{l=0}^{\infty} \alpha_l \epsilon_{k-l,j}$
where $\{\epsilon_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ is IID with $\|\epsilon_{k,j}\|_p < \infty$. Then

$$\sum_{k=1}^{\infty} \Omega_p(k) < \infty \ \text{ holds iff } \ \sum_{k=1}^{\infty} |\alpha_k| < \infty. \qquad (19)$$
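
The equivalence (19) rests on a simple identity for linear processes: replacing $\epsilon_0$ by an independent copy changes $\eta_k$ by exactly $\alpha_k(\epsilon_0 - \epsilon'_0)$. The sketch below illustrates this coupling for a single coordinate; the truncation lag, the coefficients $\alpha_l = l^{-2}$ and the helper name `linear_process_value` are illustrative assumptions, not taken from the text.

```python
import random

def linear_process_value(alpha, innovations, k):
    """eta_k = sum_l alpha[l] * eps_{k-l}; innovations is a dict indexed by time."""
    return sum(a * innovations[k - l] for l, a in enumerate(alpha))

random.seed(1)
L = 50                                             # truncation lag (assumption: alpha_l = 0 for l >= L)
alpha = [1.0] + [l ** (-2.0) for l in range(1, L)] # hypothetical coefficients alpha_l = l^{-2}

# Innovations eps_t for t = -L, ..., L, and a coupled copy where only eps_0 is replaced.
eps = {t: random.gauss(0.0, 1.0) for t in range(-L, L + 1)}
eps_coupled = dict(eps)
eps_coupled[0] = random.gauss(0.0, 1.0)            # independent copy eps_0'

diffs = []
for k in range(L):
    eta = linear_process_value(alpha, eps, k)
    eta_coupled = linear_process_value(alpha, eps_coupled, k)
    # Left: eta_k - eta_k'. Right: alpha_k * (eps_0 - eps_0').
    diffs.append((eta - eta_coupled, alpha[k] * (eps[0] - eps_coupled[0])))

print(diffs[:3])
```

Taking $L^p$-norms of this identity gives $\|\eta_k - \eta'_k\|_p = |\alpha_k|\, \|\epsilon_0 - \epsilon'_0\|_p$, so that summability of the coupling distances and absolute summability of the coefficients coincide.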

In this sense, (19) is necessary for a CLT. In fact, if it is violated, one can
construct examples such that

$$\lim_{n \to \infty} \frac{1}{n}\, [\ldots] = \infty, \qquad j \in \mathbb{N},$$

and a different normalization than $n^{-1/2}$ is required (cf. [59]). In the sequel, all
dependence conditions will be expressed in terms of summability conditions of
$\Omega_p(k)$.

A major difference when dealing with $\mathcal{C}_h$ compared to $\mathcal{C}$ (and $\mathcal{G}$) is that it
only satisfies a singular-value decomposition (SVD) in general, i.e., there exist
orthonormal bases $e = \{e_j\}_{j \in \mathbb{N}}$, $f = \{f_j\}_{j \in \mathbb{N}}$ and a sequence of real numbers
$\lambda = \{\lambda_j\}_{j \in \mathbb{N}}$ tending to zero such that for fixed $h \in \mathbb{Z}$

$$\mathcal{C}_h(\cdot) = \mathbb{E}\big[\langle X_k, \cdot \rangle X_{k-h}\big] = \sum_{j=1}^{\infty} \sqrt{\lambda_j}\, \langle e_j, \cdot \rangle f_j, \quad \text{if } \mathbb{E}\big[\|X_k\|_{L^2}^2\big] < \infty. \qquad (20)$$

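Hence a priori, $\mathcal{C}_h$ does not fit into our framework. However, by considering the
symmetrized version $\mathcal{D}(\cdot) = \mathcal{C}_h^* \mathcal{C}_h(\cdot)$, we end up with an operator that meets
our requirements. Here, $\mathcal{C}_h^*$ denotes the adjoint operator of $\mathcal{C}_h$, given by

$$\mathcal{C}_h^*(\cdot) = \mathbb{E}\big[\langle X_{k-h}, \cdot \rangle X_k\big] = \sum_{j=1}^{\infty} \sqrt{\lambda_j}\, \langle f_j, \cdot \rangle e_j. \qquad (21)$$

Routine computations (with $X_k = \sum_{j=1}^{\infty} \sqrt{\tilde{\lambda}_j}\, \eta_{k,j} e_j$) then indeed reveal that

$$\mathcal{D}(\cdot) = \sum_{j=1}^{\infty} \lambda_j \langle e_j, \cdot \rangle e_j = \sum_{j=1}^{\infty} \tilde{\lambda}_j \Big( \sum_{k=1}^{\infty} \tilde{\lambda}_k\, \mathbb{E}\big[\eta_{h,k} \eta_{0,j}\big]^2 \Big) \langle e_j, \cdot \rangle e_j. \qquad (22)$$

Hence $\mathcal{D}$ has a spectral decomposition with eigenvalues $\lambda$ and eigenfunctions $e$
and satisfies (5). Representations (20), (21) motivate a natural plug-in estimator
for $\mathcal{D}$ (cf. [11]), given as (for $h \in \mathbb{N}$)

$$\hat{\mathcal{D}}(\cdot) = \frac{1}{(n-h)^2} \sum_{1 \le k,l \le n-h} \langle X_{l+h} - \bar{X}_n, X_{k+h} - \bar{X}_n \rangle \langle X_k - \bar{X}_n, \cdot \rangle (X_l - \bar{X}_n). \qquad (23)$$

The empirical SVD components $\hat{\lambda} = \{\hat{\lambda}_j\}_{j \in \mathbb{N}}$, $\hat{e} = \{\hat{e}_j\}_{j \in \mathbb{N}}$ and $\hat{f} = \{\hat{f}_j\}_{j \in \mathbb{N}}$
are then defined via

$$\hat{\mathcal{D}}(\hat{e}_j) = \hat{\lambda}_j \hat{e}_j, \qquad \hat{\mathcal{C}}_h(\hat{e}_j) = \hat{\lambda}_j^{1/2} \hat{f}_j, \qquad (24)$$

where the empirical lag operator $\hat{\mathcal{C}}_h$ is given by

$$\hat{\mathcal{C}}_h(\cdot) = \frac{1}{n} \sum_{k=h+1}^{n} \langle X_k - \bar{X}_n, \cdot \rangle (X_{k-h} - \bar{X}_n), \qquad 0 \le h \le n-1, \qquad (25)$$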

and analogously for $-n + 1 \le h < 0$. In order to apply Theorems 2.5 and 2.6 to
$\hat{\lambda}$ and $\hat{e}$, the key objective is to validate (D1) for appropriate $\eta^{\mathcal{D}}_{i,j}$ and $\eta^{\mathcal{R}}_{i,j}$. To
this end, introduce

-^■l,h,r,i,j — (jll+h^rVlji fE[77i-(-/i,r^7i,z])® £ IN. 

Recalling X k = 2 we then define 77 ^ for hxed h £ IN as 


^ n—h oo 

} j '(jl) — “ ^ ^ A r{^A-i,h,r,i,j H - -^l,h,r,j,i) H” ^ ^ A r E ^ [^ 7 ^,r^ 70 ,j>] • 

r =1 

(26) 


Z=1 r—1 


Note that this automatically defines $\eta^{\mathcal{R}}_{i,j}$ via (6), see also (81) in the proof. We
then have the following result.

Proposition 3.1. Let $q \ge 2$ and assume $\mathbb{E}[\|X_k\|_{L^2}^{4q}] < \infty$ and $\Omega_{4q}(k) \lesssim k^{-\mathfrak{b}}$,
$\mathfrak{b} > 3/2$. Then $\mathcal{D} = \mathcal{C}_h^* \mathcal{C}_h$ and $\hat{\mathcal{D}}$ as in (23) satisfy (5) and (6) such that

$$n^{1/2} \max_{i,j \in \mathbb{N}} \big\|\bar{\eta}^{\mathcal{D}}_{i,j}\big\|_q < \infty, \qquad n^{1/2} \max_{i,j \in \mathbb{N}} \big\|\eta^{\mathcal{R}}_{i,j}\big\|_q \lesssim n^{-1/2}.$$

Related results can be established under different weak dependence
conditions, see for instance [23] or [60]. Using Proposition 3.1, it is now easy to
transfer the results, which we summarize in the following theorem.

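Theorem 3.2. Suppose that $\mathbb{E}[\|X_k\|_{L^2}^{4q}] \le C^{\mathcal{C}_h}$ for a universal constant $C^{\mathcal{C}_h}$.
Assume in addition that for some $a > 0$, $\mathfrak{h}, p \ge 1$ we have that

($\mathcal{C}_h$1) $\Omega_{4q}(k) \lesssim k^{-\mathfrak{b}}$, $\mathfrak{b} > 3/2$ for $q = p 2^{\mathfrak{p}+4}$, $\mathfrak{p} = \lceil \mathfrak{h}/a \rceil$,

($\mathcal{C}_h$2) (D2) holds with $C^{\mathcal{D}} = C^{\mathcal{C}_h}$, $m = n$, $J_n^+ \in \mathbb{N}$ and $a$ as above,

($\mathcal{C}_h$3) $0 < \inf_{j \in \mathbb{N}} \sum_{r=1}^{\infty} \tilde{\lambda}_r\, \mathbb{E}[\eta_{h,r} \eta_{0,j}]^2$.

Then Assumption 2.2 holds for $\mathcal{D} = \mathcal{C}_h^* \mathcal{C}_h$ and $\hat{\mathcal{D}}$ as in (23) with $a > 0$, $\mathfrak{h}, p \ge 1$,
$m = n$, $J_n^+ \in \mathbb{N}$, $s_m^{\mathcal{D}} = s_n^{\mathcal{C}_h} = n^{-1/2}$ and $C^{\mathcal{D}} = C^{\mathcal{C}_h}$ as above. In particular,
Theorems 2.5 and 2.6 apply to $\hat{\lambda}$ and $\hat{e}$.

It remains to deal with $\hat{f}$, which is the subject of Theorem 3.3 below.


Theorem 3.3. Grant the assumptions of Theorem 3.2, and let $1 \le p' < p$.
Then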

fj ~ /, 



_ (Aj — A j)fj _ Ch(ej e j) + {Ch Ch)(ej) 

2 A j -fXj 

(llllej -ei|| L 2 || v + ^)- 


L 2 


P' 


As the proof shows, Theorem 3.3 is essentially a concatenation of the previous 
results. Note in particular that the above expansion can be developed further 
in a straightforward manner by employing Theorems 2.5 and 2.6. 
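
The empirical objects of this section, in particular (24) and (25), can be illustrated on vector-valued data that stand in for discretized curves. The sketch below is written under several assumptions that are not from the text: the data are a simulated vector AR(1) series, the inner product is the Euclidean one, and the symmetrized operator is formed as $\hat{\mathcal{C}}_h^* \hat{\mathcal{C}}_h$ with the normalization $1/n$ of (25) rather than the normalization of (23). It checks that the eigenvalues of the symmetrized operator are the squared singular values of $\hat{\mathcal{C}}_h$ and recovers $\hat{f}_j$ through (24).

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 400, 8, 1

# Hypothetical data: a vector AR(1) series standing in for discretized curves.
A = 0.5 * np.eye(d)
X = np.zeros((n, d))
for k in range(1, n):
    X[k] = A @ X[k - 1] + rng.standard_normal(d) / np.arange(1, d + 1)

Xc = X - X.mean(axis=0)

# Matrix of the empirical lag operator (25):
# x -> (1/n) * sum_k <X_k - mean, x> (X_{k-h} - mean).
C_h = Xc[:-h].T @ Xc[h:] / n

# Symmetrized operator C_h^* C_h and its eigen-decomposition (descending order).
D_hat = C_h.T @ C_h
lam_hat, e_hat = np.linalg.eigh(D_hat)
lam_hat, e_hat = lam_hat[::-1], e_hat[:, ::-1]

# Leading left singular directions via (24): C_h e_j = sqrt(lambda_j) f_j.
J = 4
f_hat = (C_h @ e_hat[:, :J]) / np.sqrt(lam_hat[:J])

# Singular values of C_h for comparison.
sing = np.linalg.svd(C_h, compute_uv=False)
print(lam_hat[:J], sing[:J] ** 2)
```

Working with the symmetric matrix `D_hat` mirrors the approach of the text, where the eigen expansions are derived for $\mathcal{C}_h^* \mathcal{C}_h$ and then transferred to $\hat{f}$ in a second step.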


4. Long-run covariance operator 

The long-run covariance operator is a natural generalization of the covariance
operator in the presence of serial correlation. From a statistical perspective,
this is particularly relevant in the context of the CLT, where under appropriate
conditions on $X$, we have that

$$\frac{1}{\sqrt{n}} S_n = \frac{1}{\sqrt{n}} \sum_{k=1}^{n} X_k \xrightarrow{w} \mathcal{N}(0, \mathcal{G}) \quad \text{and} \quad \sup_n n^{-1/2} \big\| \|S_n\|_{L^2} \big\|_2 < \infty,$$

where $\mathcal{G}(\cdot)$ is the long-run covariance operator, (formally) defined as

$$\mathcal{G}(\cdot) = \sum_{h \in \mathbb{Z}} \mathcal{C}_h(\cdot), \qquad \mathcal{C}_h(\cdot) = \mathbb{E}\big[\langle X_k, \cdot \rangle X_{k-h}\big]. \qquad (27)$$

Note that $\mathcal{G}$ in general only exists if $\sum_{h \in \mathbb{Z}} \|\mathcal{C}_h\|_{\mathcal{L}} < \infty$, which is usually referred
to as a weak dependence condition. In view of (27), we see that $\mathcal{G}$ takes over
the role of $\mathcal{C}$ if $X$ has serial correlation: in the 'limit case' where $n^{-1/2} S_n$ is
distributed as $\mathcal{N}(0, \mathcal{G})$, the best (in $L^2$-sense) finite dimensional approximations
are provided by the classical Karhunen-Loève decomposition with respect to $\mathcal{G}$.
Hence we can expect that for large enough $n$, finite dimensional approximations
of $n^{-1/2} S_n$ based on appropriate estimates $\hat{\mathcal{G}}$ are close to optimality too. We
refer to [32], [39], [57], [58], and more recently [18] for further discussions. A
unifying, even more general object than $\mathcal{G}$ is the spectral density operator $\mathcal{F}(\theta)$,
first studied in [58], which recently has attracted a lot of attention (cf. [32], [57]).
A (detailed) study is beyond the scope of the present note, and is left open for
future research. It appears though that at least some of the results can be
transferred.

Estimation of $\mathcal{G}$ is a delicate issue, and already in the univariate/multivariate
case a substantial body of literature has evolved around this problem, see for
instance [2], [33], [64] and the many references therein. In the context of
functional data, we refer for instance to [32], [39], [57], [58]. The basic principle is
plug-in estimation, which leads to the estimates

$$\hat{\mathcal{G}}^b(\cdot) = \hat{\mathcal{C}}_0(\cdot) + \sum_{h=1}^{b} \omega_h \big( \hat{\mathcal{C}}_h(\cdot) + \hat{\mathcal{C}}_{-h}(\cdot) \big), \quad \text{where } \hat{\mathcal{C}}_h(\cdot) \text{ is as in (25),} \qquad (28)$$

and $|\omega_h| \le 1$ is a sequence of weight functions. In the sequel, the choice of $\omega_h$
has little impact on the results, and we therefore set $\omega_h = 1$ for the remainder
of this section. For consistent estimates, it is necessary that $b = b_n \to \infty$ as
$n$ increases. Even so, in contrast to $\hat{\mathcal{C}}_h$, the estimate $\hat{\mathcal{G}}^b$ is biased. Depending
on the decay rate of $\|\mathcal{C}_h\|_{\mathcal{L}}$, the optimal choice of $b_n$ is $b_n \sim \log n$ (geometric
decay), or $b_n \sim$ (polynomial decay with $\mathfrak{s}$), see [2]. Thus, the actual
operator we are estimating is

$$\mathcal{G}^b(\cdot) = \sum_{|h| \le b} \mathcal{C}_h(\cdot). \qquad (29)$$
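
The structure of the plug-in estimate (28) is perhaps easiest to see in the scalar case, where the lag operators reduce to autocovariances and $\hat{c}_h = \hat{c}_{-h}$. The following is only a minimal sketch under assumptions that are not from the text: a simulated scalar AR(1) series with coefficient $0.5$ and unit innovation variance, weights $\omega_h = 1$, a fixed window $b = 20$, and hypothetical helper names `lag_cov` and `long_run_variance`.

```python
import random

def lag_cov(x, h):
    """Scalar analogue of (25): (1/n) * sum_{k=h+1}^{n} (x_k - mean)(x_{k-h} - mean)."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[k] - mean) * (x[k - h] - mean) for k in range(h, n)) / n

def long_run_variance(x, b):
    """Scalar analogue of (28) with weights omega_h = 1 (so c_h + c_{-h} = 2 c_h)."""
    return lag_cov(x, 0) + sum(2.0 * lag_cov(x, h) for h in range(1, b + 1))

random.seed(0)
n, phi, b = 40_000, 0.5, 20
x = [0.0]
for _ in range(n - 1):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))

estimate = long_run_variance(x, b)
target = 1.0 / (1.0 - phi) ** 2   # long-run variance of this AR(1) model
print(estimate, target)
```

For this geometrically decaying example the truncation at $b$ lags leaves a very small bias, while the variance of the estimate grows with $b$; this trade-off is behind the choice $b = b_n \to \infty$ with $b = o(n)$.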

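Note that in general $\mathbb{E}[\hat{\mathcal{G}}^b] \ne \mathcal{G}^b$ and hence $\hat{\mathcal{G}}^b$ is still biased, but this bias is
negligible. We point out that subject to some regularity conditions (cf. [58])

$$\big\| \|\hat{\mathcal{G}}^b - \mathcal{G}^b\|_{L^2} \big\|_2 \sim \sqrt{b/n}, \qquad (30)$$

which is the same rate as in the univariate case (cf. [2]). Moreover, under quite
general assumptions (cf. [32], [58]), it follows that $\mathcal{G}^b$ satisfies the spectral
decomposition

$$\mathcal{G}^b(\cdot) = \sum_{j=1}^{\infty} \lambda_j^b \langle e_j^b, \cdot \rangle e_j^b, \qquad \sum_{j=1}^{\infty} \lambda_j^b < \infty, \qquad (31)$$

with eigenvalues $\lambda^b = \{\lambda_j^b\}_{j \in \mathbb{N}}$ and eigenfunctions $e^b = \{e_j^b\}_{j \in \mathbb{N}}$. Since the
actual underlying operator of interest is $\mathcal{G}^b$, it is natural to (first) express our
conditions in terms of $\lambda^b$ and $e^b$. We can decompose $X_k$ as

$$X_k = \sum_{j=1}^{\infty} \sqrt{\tilde{\lambda}_j^b}\, \eta_{k,j}^b e_j^b, \qquad \tilde{\lambda}_j^b = \mathbb{E}\big[\langle X_k, e_j^b \rangle^2\big], \qquad \eta_{k,j}^b = \langle X_k, e_j^b \rangle \big(\tilde{\lambda}_j^b\big)^{-1/2}. \qquad (32)$$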


Observe that in general $\mathbb{E}[\eta_{k,j}^b \eta_{k,i}^b] \ne 0$ for $i \ne j$, which is different from the
Karhunen-Loève expansion. In analogy to (7), we also introduce the quantity


*1 3 


b t \ V'' ^k^kj I ^k-h^kj 

^» = E^ + E E- n-j» 

k= 1 h=lk=h+l 


(33) 


It is then easy to see that 

OO _ 

q\) = E -) e r ^ 6 (-) 

*d=i 


00 n=n=~ 

V A‘ A * E Wj]< e !.-><4 

(34) 


for appropriate (degenerate) random variables $\{\eta^{\mathcal{R}}_{i,j}\}_{i,j \in \mathbb{N}}$ (see (93)). Taking
(31) into account, we see that both (31), (34) match the setup in (5) and (6).
We can thus appeal to the results of Section 2. To this end, it is convenient to
denote with

(35) 


eh = E [nij] = E ojL e F - 

\h\<b 

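Note that by Lemma 2.1 we have for $b \in \mathbb{N}$ (including $b = \infty$)

$$\varphi^b_{i,j} = 0 \ \text{ if } i \ne j \quad \text{and} \quad \lambda_j^b = \varphi^b_{j,j}\, \tilde{\lambda}_j^b.$$

Let us now translate Assumption 2.2 to our present setup.

Assumption 4.1. The sequence $X$ is stationary such that $\sum_{h \in \mathbb{Z}} \|\mathcal{C}_h\|_{\mathcal{L}} < \infty$.
Moreover, for $b = o(n)$, a universal constant $C^{\mathcal{G}} < \infty$ and universal sequence
$s_n^{\mathcal{G}} = o(1)$ and $a > 0$, $\mathfrak{h}, p \ge 1$ and $J_n^+ \in \mathbb{N}$ it holds that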

(Gl) b (n/b )3 maxqj11 rj b j (n)11 < C G , n~Ui max ieW ||ELi - s ™ for 


q =p2P+ 4 , p = |~f)/a], 


{n/b)-? +a YA= 


SiSRfi’ 


(n/ 6 )“ 


l ^-^OO 
2^= 


2— 1 / \ b 




< C g and 


(G2)° max^^j-f • 

> (n/b)~^, 

(G3) b l/C g < ip b j}j < C g for j G IN, A b < C g . 

Let us discuss these conditions. In view of (30), the choice $m = n/b$ is quite
natural. Condition (G1)$^b$ is a little more explicit than (D1), but of the same
nature. (G2)$^b$, (G3)$^b$ are essentially translations of (D2), (D3). Note that in
the present formulation, (G3)$^b$ reflects the common non-degeneracy assumption
encountered in the time series literature.
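
To get a feeling for how demanding the moment order in (G1)$^b$ can be, the following sketch evaluates $q = p2^{\mathfrak{p}+4}$ with $\mathfrak{p} = \lceil \mathfrak{h}/a \rceil$. This reading of the exponents is an assumption about the condition as printed, and the function name `moment_order` as well as the numerical inputs are purely illustrative.

```python
import math

def moment_order(p, a, h):
    """Return (frak_p, q) with frak_p = ceil(h / a) and q = p * 2**(frak_p + 4).

    Assumed reading of the moment condition; p, a, h are illustrative inputs.
    """
    if p < 1 or a <= 0 or h < 1:
        raise ValueError("expected p >= 1, a > 0 and h >= 1")
    frak_p = math.ceil(h / a)
    return frak_p, p * 2 ** (frak_p + 4)

# A smaller rate exponent a forces a much higher moment order q.
orders = {a: moment_order(p=2, a=a, h=1.5) for a in (0.25, 0.125)}
```

Halving $a$ doubles $\mathfrak{p}$ and therefore squares the factor $2^{\mathfrak{p}}$, which is why the moment requirement grows so quickly as the targeted rate improvement $a$ shrinks.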

The setup in Assumption 4.1 is quite general. Before looking at the possible
range of applications, let us formulate the transferred results. To this end, in
analogy to $I_{i,j}$ in (11), we introduce $I^b_{i,j}$ as

$$I^b_{i,j} = \langle (\hat{\mathcal{G}}^b - \mathcal{G}^b)(e_i^b), e_j^b \rangle, \quad i,j \in \mathbb{N}. \qquad (36)$$

We then have the following general transfer result.

Theorem 4.2. Assume that Assumption 4.1 holds. Then for $1 \le J \le J^+$,
Theorem 2.5 and Theorem 2.6 remain valid if we substitute $n/b$, $\lambda^b$, $e^b$, $\hat{\lambda}^b$,
$\hat{e}^b$ and $I^b_{i,j}$ at the corresponding places. Moreover, corresponding versions of
Proposition 2.7 and Corollary 2.8 hold.

Due to the uniform bounds provided by $C_{\mathcal{G}}$ in Assumption 4.1, Theorem 4.2
can either be used pointwise (for arbitrary but fixed $b, n \in \mathbb{N}$), or uniformly in
$b, n$, depending on whether Assumption 4.1 holds pointwise or uniformly. The
strength and weakness of Theorem 4.2 is that everything is essentially expressed
in terms of the operator $\mathcal{G}^b$. The positive aspect is that this makes the
assumptions rather general (in fact, almost optimal in a certain sense, see below). On
the other hand, the drawback is that these conditions can be difficult to verify,
since they explicitly depend on $b$. If $b = b_n$ is a function of $n$, this is not so
useful, and one would be more interested in uniform bounds in terms of $n$. Let
us mention here that the trouble mainly originates from (G2)$^b$ and not (G1)$^b$.
It is therefore desirable to find simple conditions that depend in a more
transparent way on $b$, and preferably mainly on $\mathcal{G}$. More precisely, the aim is to find
simple, sufficient conditions that imply a uniform validity of Assumption 4.1.
Before turning to this issue, let us first discuss an interesting case where the
problem mentioned above does not occur.

m-correlated processes: We call $X$ an $m$-correlated process if $C_h = 0$ for
$|h| > m$, where $m$ is finite. Locally dependent processes are quite common in the
literature, and often modeled as $m$-dependent processes. Clearly, $m$-dependence
implies $m$-correlation. Moreover, we get that

$$\mathcal{G}^b = \sum_{|h| \le b} C_h = \sum_{|h| \le m} C_h = \mathcal{G}^m = \mathcal{G}, \quad \text{if } m \le b.$$

Note that $m$-correlation also implies that representations (31) and (34) are valid.
Hence we conclude the following.

Corollary 4.3. If $X$ is $m$-correlated and $m \le b$, then we can replace $e_j^b$, $\eta_{k,j}^b$
with $e_j^m$, $\eta_{k,j}^m$ everywhere in (32) and (33) (which alters (G1)$^b$), and $b$ with $m$
everywhere in (G2)$^b$ and (G3)$^b$.

Corollary 4.3 shows that Theorem 4.2 applies to a large class of processes
under general and accessible conditions. Note in particular that the optimality
criterion used in Section 2.3 also applies since $m$ is finite. In the presence of
$m$-dependence, the conditions can be further simplified. More precisely, routine
calculations reveal that (G1)$^b$ can be replaced with

(G1)$^m$ $\max_{j \in \mathbb{N}} \|\eta_{k,j}^m\|_{2q} < \infty$ for $q = p2^{\mathfrak{p}+4}$, $\mathfrak{p} = \lceil \mathfrak{h}/a \rceil$.
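
The identity $\mathcal{G}^b = \mathcal{G}^m = \mathcal{G}$ for $m \le b$ is easy to check in the scalar case. The sketch below uses a real-valued MA(1) sequence as a stand-in for a 1-correlated process (an illustrative assumption, not an example taken from the text); the helper names are hypothetical.

```python
def truncated_sum(autocov, b):
    """Sum of c_h over |h| <= b for a symmetric scalar autocovariance c_h = c_{-h}."""
    return autocov(0) + 2.0 * sum(autocov(h) for h in range(1, b + 1))

def ma1_autocov(theta):
    """Autocovariance of X_k = e_k + theta * e_{k-1} with unit-variance white noise e."""
    def c(h):
        h = abs(h)
        if h == 0:
            return 1.0 + theta ** 2
        if h == 1:
            return theta
        return 0.0  # 1-correlated: c_h vanishes for |h| > 1
    return c

c = ma1_autocov(0.5)
sums = [truncated_sum(c, b) for b in range(5)]  # bandwidths b = 0, 1, 2, 3, 4
```

For every bandwidth $b \ge m = 1$ the truncated sum equals the full long-run variance $(1 + \theta)^2$, so no truncation bias remains once $b$ covers the correlation range.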

Let us now return to the problem of uniform bounds where $b = b_n \to \infty$ as $n$
increases. As mentioned earlier, it is desirable to find analogous conditions that
depend in a more transparent way on $b$, and are expressed mainly in terms of
$\mathcal{G}$. To this end, it is convenient to denote with $\lambda_j = \lambda_j^{\infty}$, $e_j = e_j^{\infty}$, $\varphi_{i,j} = \varphi_{i,j}^{\infty}$
and $\eta_{k,j} = \eta_{k,j}^{\infty}$. For the sake of reference, we then restate the decomposition of
$X_k$ in this context, which amounts to

OO ,- 

Xk = ^ y ^jVkjGji -\j = ® e j) ]i Vk,j = {Xk,e.j)Xj ! . (37) 

i=i 

Recall the notion of Q p (k), defined in (18). We then make the following set of 
assumptions. 

Assumption 4.4. Let $a > 0$, $1 < c^+ \le c^- < \infty$ and $J^+ \le n^{1/2-a}(\log n)^{-1}$.
Put $p^* = p2^{\mathfrak{p}+4}$, $\mathfrak{p} = \lceil c^-/(2a) \rceil$ and $b \ge C_0 \log n$ for $C_0 > 0$ sufficiently large. It
then holds that

(G1) $\Omega_k(2p^*) \lesssim \rho^k$, $0 < \rho < 1$,

(G2) the function $\lambda(x) : x \mapsto \lambda_x$ is convex and $j^{-c^-} \lesssim \lambda_j \lesssim j^{-c^+}$ uniformly for
$j \in \mathbb{N}$,

(G3) 1/C G < for C G > 0. 

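Remark 4.5. Condition $j^{-c^-} \lesssim \lambda_j \lesssim j^{-c^+}$ can also be replaced with $e^{-c^- j} \lesssim
\lambda_j \lesssim e^{-c^+ j}$, provided that $J^+ \lesssim \log n$. Similarly, the convexity condition in
(G2) can be replaced with a bound on $\max_{1 \le j \le J^+} 1/\psi_j$, where we recall that $\psi_j =
\min\{\lambda_{j-1} - \lambda_j, \lambda_j - \lambda_{j+1}\}$ (with $\psi_1 = \lambda_1 - \lambda_2$).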
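
The spectral gap $\psi_j$ is simple to compute for a given eigenvalue sequence. The following sketch does so for the polynomial decay $\lambda_j = j^{-c}$ with $c = 2$, a value chosen only for illustration; the function name `spectral_gaps` is hypothetical.

```python
def spectral_gaps(lam):
    """psi_j = min(lam_{j-1} - lam_j, lam_j - lam_{j+1}), with psi_1 = lam_1 - lam_2.

    lam is a decreasing list; the last entry has no right neighbour and is skipped.
    """
    gaps = []
    for j in range(len(lam) - 1):
        right = lam[j] - lam[j + 1]
        left = lam[j - 1] - lam[j] if j > 0 else right
        gaps.append(min(left, right))
    return gaps

c = 2.0                                    # illustrative decay exponent
lam = [j ** (-c) for j in range(1, 8)]     # lambda_j = j^{-c}, j = 1, ..., 7
psi = spectral_gaps(lam)
right_gaps = [lam[j] - lam[j + 1] for j in range(len(lam) - 1)]
```

For a convex, decreasing sequence the consecutive differences are themselves decreasing, so the minimum is always attained by the right-hand gap $\lambda_j - \lambda_{j+1}$; this is what makes the convexity condition convenient to work with.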

Let us elaborate on these assumptions. (G1) is a weak dependence condition
that requires a geometric decay, and implies in particular that $\mathcal{G}$ exists. This
condition is satisfied for a large number of processes in the literature such as
ARMA and GARCH models. Note that instead of using $\Omega_k(2p^*)$ as dependence
measure, one could also use mixing concepts like strong mixing or r-mixing
(cf. [23]). We remark that the method of proof can also be used under the
weaker assumption of polynomial decay. Unfortunately, this leads to
(significantly) more restrictive conditions for the eigenvalues $\lambda$ and the range $J^+$. This
is not surprising, since in this case the bias $\|\mathcal{G} - \mathcal{G}^b\|_{\mathcal{L}}$ is much larger and thus
more relevant, particularly if $b = b_n$ is chosen in the optimal way. In view of
Lemma 1.1 (see also Lemma 9.2 for a more general version), it seems to be
impossible to express Assumption 4.1 in terms of $\mathcal{G}$ without additional (heavy)
assumptions for $\lambda$ and/or $J^+$, simply because the distance $\|\mathcal{G} - \mathcal{G}^b\|_{\mathcal{L}}$ is too
large. Condition (G2) imposes regularity conditions on the eigenvalues $\lambda$. We
have already seen in the discussion of (C2) in Section 2 that the convexity
condition is mild, and leads to the simple condition $J^+ \le n^{1/2-a}(\log n)^{-1}$ (see (10)).
The assumption $j^{-c^-} \lesssim \lambda_j \lesssim j^{-c^+}$ implies that $\lambda$ fluctuates between polynomial
decay boundaries. Condition $1 < c^+ \le c^- < \infty$ allows for a large variety here
though, with a possibly varying decay coefficient for $\lambda_j$. Moreover, as is pointed
out in Remark 4.5 above, a formulation in terms of geometric decay boundaries
is also possible. Finally, (G3) reflects the usual non-degeneracy condition
already mentioned above.


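In analogy to (36), we introduce

$$I^{(\infty,b)}_{i,j} = \langle (\hat{\mathcal{G}}^b - \mathcal{G}^b)(e_i), e_j \rangle, \quad i,j \in \mathbb{N}. \qquad (38)$$

We then have the following first order expansion for the empirical eigenvalues
$\hat{\lambda}^b$.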

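Theorem 4.6. Assume that Assumption 4.4 holds. Then for $1 \le J \le J^+$

$$\Big\| \max_{1 \le j \le J} \frac{1}{\lambda_j} \big| \hat{\lambda}_j^b - \lambda_j - I^{(\infty,b)}_{j,j} \big| \Big\|_p \lesssim \frac{J^{1/p} (n/b)^{-a}}{\sqrt{n/b}}.$$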

Next, we state the corresponding result for the empirical eigenfunctions $\hat{e}^b$.
Theorem 4.7. Assume that Assumption 4.4 holds. Then for $1 \le J \le J^+$

(oo ,b) 


max 
i <j<J 


i 


\A 7 V 




e i+ 2 


J \\^b 


e j e i Hl 2 


E 


i 


&k 


k,j 


k=1 
k^j 


Xj Xk 


L 2 


< 


jV p (n/6)- a 

y/n/b 


where A j = 

kAi 


\ a\u 

(V^F> 


and we also have 


max 
i <j<J 



2 

]L 2 


OO 


E 

fc=1 


( r(ooA) \2 \ 

(A j-Xk)*) 


J 1 /p{n/b)~ a 

n/b 


As before, we also have corresponding versions of Proposition 2.7 and
Corollary 2.8. Formulating the analogues needs a little more care and is not
immediate, so we state them explicitly. To this end, denote with


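$$\eta^{(\infty,b)}_{i,j} = \eta^{(\infty,b,1)}_{i,j} + \eta^{(\infty,b,2)}_{i,j},$$

where

$$\eta^{(\infty,b,1)}_{i,j} = \sum_{k=1}^{n} \frac{\eta_{k,i}\eta_{k,j}}{n} \quad \text{and} \quad \eta^{(\infty,b,2)}_{i,j} = \sum_{h=1}^{b} \sum_{k=h+1}^{n} \frac{\eta_{k,i}\eta_{k-h,j} + \eta_{k-h,i}\eta_{k,j}}{n-h}. \qquad (39)$$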
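
The two pieces in (39) are a lag-0 average and a sum of lag-$h$ cross-averages, each normalized by $n - h$ and truncated at $b$. A scalar sketch of this construction (taking $i = j$ and a real-valued, already centered sample, both of which are simplifying assumptions) may help to fix ideas; the function names are hypothetical.

```python
def lag_average(x, h):
    """Average of x[k] * x[k-h] over the n - h available pairs."""
    n = len(x)
    return sum(x[k] * x[k - h] for k in range(h, n)) / (n - h)

def truncated_long_run_variance(x, b):
    """Lag-0 average plus twice the lag-h averages for h = 1, ..., b.

    Scalar analogue with i = j, so the two cross terms at lag h coincide.
    The data are assumed to be centered; no mean is subtracted here.
    """
    if not 0 <= b < len(x):
        raise ValueError("bandwidth b must satisfy 0 <= b < n")
    lag0 = lag_average(x, 0)
    higher = sum(2.0 * lag_average(x, h) for h in range(1, b + 1))
    return lag0 + higher

x = [1.0, -1.0, 1.0, -1.0]                      # tiny alternating sample
values = [truncated_long_run_variance(x, b) for b in range(3)]
```

On this alternating sample the estimate changes sign with the bandwidth, a reminder that a plain truncation of the lag sum need not be non-negative in finite samples.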


Then we have the following results.
Proposition 4.8. Assume that Assumption 4.4 holds. Then for $1 \le J \le J^+$,
one may replace $\{I^{(\infty,b)}_{k,j}\}_{k,j \in \mathbb{N}}$ with $\{(\lambda_k\lambda_j)^{1/2}\eta^{(\infty,b)}_{k,j}\}_{k,j \in \mathbb{N}}$ in Theorems 4.6 and
4.7.

Corollary 4.9. Assume that Assumption 4.4 holds. Then for $1 \le j \le J^+$

IIA 7 - - XA\ < and || ||e,-— e,-||£ 2 1| < — ■ 
ll j J Hp~ yfo II" 3 3 " h llp~ n 


5. Maximum deviation of empirical eigenvalues 


As already mentioned, Theorems 2.5 and 2.6 can be used to obtain various
fluctuation results for eigenvalues or eigenfunctions. We exemplify this further
in the case of $\mathcal{D} = C$, mentioning that a similar program can be carried out for $\mathcal{D} =
C_h^* C_h$, $h \in \mathbb{Z}$ fixed. To this end, we formally introduce the long-run covariance
(recall that $\bar{X} = X - E[X]$) as


7 i j = lim —E 
n—> oo 77 , 


E(4-‘)(4-i) 


■k,l =1 


(40) 


In Section 10.1 we show that this is well-defined given Assumption 5.1 below. 
Moreover, for <rj = jjj we have the usual representation <r| = 4>k,j, where 

<t>k,j = Cov[r?o,j%,jj Vk,jVk,j]- Consider C with eigenvalues A and denote with 

T J = Vn max J^—r —, = max |^|, (41) 

where {^j} 1 < j <J is a zero mean sequence of Gaussian random variables with 
correlation structure £f = (pi,j) 1<i ^ <j; where ptj = In the sequel, 

we show that T J+ is close to T^ + in probability. To this end, we work under the 
following assumption. 

Assumption 5.1. For p > 1 let q = p2 p+i 1 p = [b/a], and assume that 

(El) E[||X fe ||2 2 ] < oo and (C2) hold (with a, f) as above) such that 

(j+) 1/j V a <n~ s , S > 0, 

(E2) n k {2q) < k~\ b > 3/2, 

(E3) infj Oj > 0. 

Note that these assumptions are mild. In particular, the decay rate b in 
condition (E2) is completely independent of the underlying dimension ,/+. We 
then have the following result. 

Theorem 5.2. Grant Assumption 5.1. Then 

$$\sup_{x \in \mathbb{R}} \big| P(T_{J^+} \le x) - P(T^Z_{J^+} \le x) \big| \lesssim n^{-C}, \quad C > 0.$$





M. Jirak/Eigen expansions and uniform bounds 


19 


The above result provides a Gaussian approximation with an algebraic rate.
Note that no conditions on the underlying covariance structure are required. If
we impose a very weak decay assumption on $\gamma_{i,j}$, we obtain the limit
distribution.

Corollary 5.3. Grant Assumption 5.1, and assume in addition

$$|\gamma_{i,j}| \log(|i - j|) = o(1) \quad \text{for } |i - j| \to \infty. \qquad (42)$$

Then for $x \in \mathbb{R}$

$$\lim_{n \to \infty} P\big(T_{J^+} \le u_{J^+}(x)\big) = \exp(-e^{-x}),$$

where $u_m(x) = x/a_m + b_m$ with $a_m = (2\log m)^{1/2}$ and $b_m = (2\log m)^{1/2} -
(8\log m)^{-1/2}(\log\log m + 4\pi - 4)$ for $m \in \mathbb{N}$.
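
In practice the limit law is used through the thresholds $u_m(x)$. The sketch below evaluates the norming constants exactly as they are stated in Corollary 5.3, reading the last term as $4\pi - 4$ (an assumption about the printed formula), and inverts the Gumbel distribution function to obtain a critical value. The function names and the choice $m = 50$ are illustrative only.

```python
import math

def norming_constants(m):
    """a_m and b_m as stated in Corollary 5.3 (last term read as 4*pi - 4)."""
    log_m = math.log(m)
    a_m = math.sqrt(2.0 * log_m)
    b_m = a_m - (math.log(log_m) + 4.0 * math.pi - 4.0) / math.sqrt(8.0 * log_m)
    return a_m, b_m

def u(m, x):
    """Threshold u_m(x) = x / a_m + b_m."""
    a_m, b_m = norming_constants(m)
    return x / a_m + b_m

def gumbel_cdf(x):
    """Limit distribution function exp(-exp(-x))."""
    return math.exp(-math.exp(-x))

def critical_value(m, alpha):
    """Threshold whose limiting non-exceedance probability is 1 - alpha."""
    x = -math.log(-math.log(1.0 - alpha))   # Gumbel quantile at level 1 - alpha
    return u(m, x)

a_50, b_50 = norming_constants(50)
cv_05 = critical_value(50, 0.05)
cv_10 = critical_value(50, 0.10)
```

A simultaneous band for the first $m$ eigenvalues would then compare the normalized deviations against `critical_value(m, alpha)`; how the deviations are normalized follows (41) and is not reproduced here.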

Remark 5.4. Note that condition (42) is essentially the weakest possible
currently known, see [47], [48] and [33].

Uniform control measures are an important statistical tool and have many
applications. In the present context, Corollary 5.3 allows for the construction
of simultaneous confidence bands for $\lambda_j$. This in turn is very useful to assess
parametric hypotheses and decay rates of the structure of $\lambda$. A particular and
important case is the determination of relevant principal components. A huge
number of stopping rules have been developed in the literature (cf. [43], [45]),
which all require a uniform control of $\hat{\lambda}$. As pointed out by a reviewer, Corollary
5.3 can be particularly useful in case of threshold rules like the scree plot, see
also [5] for related problems.

6. Applications 

A huge bulk of testing and estimation problems in FPCA is related to the
normalized scores $\{\eta_{k,j}\}_{k \in \mathbb{Z}, j \in \mathbb{N}}$ in some way or other, where the associated
operator is either $C_h$ or $\mathcal{G}$. Among others, we mention (two) sample mean tests
and related problems ([38], [39], [51]), tests about potential serial correlation,
stationarity and related issues ([5], [25], [27], [37], [40], [46], [58], [56]), various
change point problems ([4], [7], [36]), and many more. Given a sample of size $n$,
the canonical estimator of the scores is their empirical version

$$\hat{\eta}_{k,j} = \langle X_k, \hat{e}_j \rangle (\hat{\lambda}_j)^{-1/2}, \quad 1 \le k \le n, \quad 1 \le j \le J^+.$$

Intuitively, it is clear that the power of tests or estimation accuracy is augmented
if $J^+$ increases with the sample size, since more and more information is taken
into account. From a theoretical statistical point of view, this can be made
rigorous by minimax theory for estimates and Ingster's (minimax) theory for
tests (cf. [42], [34]). In [26], a striking example is presented where a very large
number of principal components is required to adequately describe the data, see
also [14]. Let us also mention that the necessity of uniform control of $\hat{\lambda}$ and $\hat{e}$
also arises in the completely different field of machine learning in the context of
techniques based on reproducing kernel Hilbert spaces, see for instance [9]. All
this highlights the importance of a uniform, accurate control of $\hat{\lambda}$ and $\hat{e}$ as $J^+$
increases, and the usefulness of results like Theorems 2.5 and 2.6.

Let us briefly discuss how this relates to our main Assumption 2.2. Due to
its general formulation, (D1) is very flexible. In particular, all the problems
mentioned above can be reformulated in a (general) framework (depending on
the problem and corresponding operator) such that (D1) is valid. Regarding
(D2), the convexity assumption (9) leading to (10) provides a general and
simple condition that is recommended for all the applications. In particular, the
resulting range $J^+$ of potentially allowed principal components is quite large.
(D3) typically reflects a non-degeneracy condition, which usually is necessary
anyway in the problem at hand. We do not take this discussion any further,
but rather investigate two other applications in a little more detail. The first
one is the functional linear model, which contains in particular first order
autoregression in Hilbert spaces (coined ARH(1) or FAR(1)). As a second, very
different application, we survey how and why long-memory situations can arise
in a functional context and how this relates to our results.

6.1. Functional linear regression 

A fundamental regression model in a high-dimensional context is the functional
linear model. Given $X = \{X_k\}_{k \in \mathbb{Z}}$, $Y = \{Y_k\}_{k \in \mathbb{Z}} \in L^2(T)$, the basic model is
defined as

$$X_k = \Phi(Y_k) + \epsilon_k, \quad k \in \mathbb{Z}, \qquad (43)$$

where $\Phi$ is a (bounded) linear operator, mapping from $L^2(T)$ to $L^2(T)$, and
$\epsilon = \{\epsilon_k\}_{k \in \mathbb{Z}} \in L^2(T)$ is a noise sequence. The goal is to recover $\Phi$, given
$X$ and $Y$, while the noise $\epsilon$ is unknown. Observe that estimating $\Phi$ is an
ill-posed problem, see e.g. [17] for a more detailed discussion. Model (43) and its
many variations have been extensively studied in the literature, with active
research persisting (see e.g. [41]), and it would be impossible to survey all the
results. From a theoretical perspective, a significant part of the current literature
(cf. [13], [16], [20], [29], [30], [53] and the extensive references therein) focuses
on the case where $Y$ and $\epsilon$ are mutually independent (which excludes ARH(1)),
and in addition $X_k$, $\Phi(Y_k)$, $\epsilon_k$ are all real-valued. Hence by the Riesz representation
theorem $\Phi(\cdot) = \langle x^{\phi}, \cdot \rangle$ for some $x^{\phi} \in L^2(T)$, and it all boils down to the estimation of
$x^{\phi}$.

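Let us touch on the main idea for estimating $\Phi$. Denote with $C^y$ the covariance
operator of $Y$ with eigenvalues $\lambda^y$ and eigenfunctions $e^y$. For the remainder of
this section, we assume that $\epsilon = \{\epsilon_k\}_{k \in \mathbb{Z}} \in L^2(T)$ is an IID sequence, and for
each $k \in \mathbb{Z}$, $\epsilon_k$ and $Y_k$ are independent. Applying Fubini-Tonelli we get that for
$j \in \mathbb{N}$

$$\Gamma(e_j^y) = E[\langle Y_k, e_j^y \rangle X_k] = E[\langle Y_k, e_j^y \rangle \Phi(Y_k)] + E[\langle Y_k, e_j^y \rangle \epsilon_k] = \Phi\big(E[\langle Y_k, e_j^y \rangle Y_k]\big) = \lambda_j^y \Phi(e_j^y).$$

Hence we obtain the alternative representation

$$\Phi(\cdot) = \sum_{j=1}^{\infty} \Phi\big(\langle e_j^y, \cdot \rangle e_j^y\big) = \sum_{j=1}^{\infty} \frac{\lambda_j^y \Phi(e_j^y)}{\lambda_j^y} \langle e_j^y, \cdot \rangle = \sum_{j=1}^{\infty} \frac{\Gamma(e_j^y)}{\lambda_j^y} \langle e_j^y, \cdot \rangle. \qquad (44)$$

The advantage of this representation is that all involved quantities can be
estimated. Given a truncation parameter $b \in \mathbb{N}$, this motivates the estimate

$$\hat{\Phi}_b(\cdot) = \sum_{j=1}^{b} \frac{1}{n} \sum_{k=1}^{n} \frac{\langle Y_k, \hat{e}_j^y \rangle X_k}{\hat{\lambda}_j^y} \langle \hat{e}_j^y, \cdot \rangle, \quad b = b_n \to \infty \text{ as } n \text{ increases}. \qquad (45)$$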

In special cases, it is known that (a version of) $\hat{\Phi}_b$ is sharp minimax optimal
(cf. [53]), and adaptive in slightly more general situations (cf. [20]). The
construction of $\hat{\Phi}_b$ illustrates the necessity of an accurate control of $\hat{\lambda}^y$ and $\hat{e}^y$.
We remark that Proposition 2.4 is very useful in this context. Not only can it
be used to obtain precise bounds for prediction errors or the actual estimation
error $\|\hat{\Phi}_b - \Phi\|_{\mathcal{L}}$ itself, but also for deriving various limit theorems for functions
of $\hat{\Phi}_b$, which require exact expansions. Limit theorems in turn are required for
goodness of fit tests or the construction of confidence sets.

Let us now consider the setup where $Y_k = X_{k-1}$, which is exactly the case
of an ARH(1) process. Note that for $p \in \mathbb{N}$ finite any ARH($p$) process can be
reformulated as an ARH(1) process by changing the underlying Hilbert space,
see [11] for details. Below in Corollary 6.2, we provide simple yet general
conditions that imply the validity of Proposition 2.4 for ARH(1)-processes. In view of
the discussion about the convexity condition in (9) leading to (10), providing a
general and simple condition, we only touch on the validity of (C1). Regarding
the operator $\Phi$, we assume that it possesses the spectral decomposition

OO OO 

*(-) = EE4<>. (46) 

3=1 3=1 


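with eigenvalues $\lambda^{\phi}$ and eigenfunctions $e^{\phi}$. In the sequel, let $\Theta$ be any operator
with eigenvalues $\lambda^{\theta}$ and eigenfunctions $e^{\theta}$ satisfying the spectral decomposition

$$\Theta(\cdot) = \sum_{j=1}^{\infty} \lambda_j^{\theta} \langle e_j^{\theta}, \cdot \rangle e_j^{\theta}, \qquad \sum_{j=1}^{\infty} \lambda_j^{\theta} < \infty. \qquad (47)$$

Natural candidates for $\Theta$ in our framework are of course the operators $C_h^* C_h$ or
$\mathcal{G}^b$. We have the associated usual decomposition of $X_k$, given as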

Xk = E \ EE V k e X J = K[(Xk,el ) 2 ], = (Xk, e®)(A*r 1/2 - 


3=1
Similarly, denote with $C^{\epsilon}$ the covariance operator of $\epsilon_k$ with eigenvalues $\lambda^{\epsilon}$
and eigenfunctions $e^{\epsilon}$, and consider the decomposition $\epsilon_k = \sum_{j=1}^{\infty} \sqrt{\lambda_j^{\epsilon}}\, \epsilon_{k,j} e_j^{\epsilon}$,
$k \in \mathbb{Z}$. We make the following distributional assumption for $\epsilon_k$. Given $q \ge 1$,
there exists a $q' \ge q$ and a constant $C_q > 0$ such that

\/x £ L 2 (T) with ||cc||l 2 = 1 it holds that || (ek, x) ||^ 9 < C*q (11 (efe, a;) || 2 )^ • (48) 

Condition (48) is mild and allows for a certain invariance in our results, see below
for more details. A general example satisfying (48) with $q' = q$ is the
following. Suppose that for each fixed $k \in \mathbb{Z}$, $\{\epsilon_{k,j}\}_{j \in \mathbb{N}}$ forms a martingale difference
sequence with respect to some filtration. Elementary calculations together
with Burkholder's inequality then yield the validity of (48). Note that since the
scores of a covariance operator always have zero correlation, demanding an
underlying martingale structure is a reasonable assumption. Observe that in the
Gaussian case, we even have that $\{\epsilon_{k,j}\}_{j \in \mathbb{N}}$ is IID, which is a common
assumption in the literature. Next, recall the notion of weak dependence introduced in
Section 3. We then have the following result.

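Proposition 6.1. Assume that $\Phi$, $\Theta$ satisfy representations (46), (47). If
$E[\|\epsilon_k\|^2_{L^2}] < \infty$, then $X$ is a stationary Bernoulli-shift process which can be
written as $X_k = \sum_{i=0}^{\infty} \Phi^i(\epsilon_{k-i})$. If in addition $\{\epsilon_k\}_{k \in \mathbb{Z}}$ satisfies (48) for some
$2 \le q \le q'$, then

$$\max_{j \in \mathbb{N}} \big\| \eta^{\theta}_{k,j} - (\eta^{\theta}_{k,j})' \big\|_q \lesssim \rho^k, \quad 0 < \rho < 1, \quad k \in \mathbb{N}. \qquad (49)$$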

Note that the geometric contraction property in (49) is independent of the
underlying orthonormal basis $e^{\theta}$, which is a desirable property. A check of the
proof reveals that this essentially follows from condition (48). We also remark
that Proposition 6.1 can be extended to more general ARH($p$)-processes using
the same method as in [11].
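
The Bernoulli-shift representation $X_k = \sum_{i \ge 0} \Phi^i(\epsilon_{k-i})$ can be checked numerically in a finite-dimensional toy version. The sketch below assumes a diagonal operator on $\mathbb{R}^3$ with contraction coefficients chosen for illustration, starts the recursion at zero, and compares it with the correspondingly truncated series; all names and values are hypothetical.

```python
import random

def simulate_diagonal_ar1(phi, noise):
    """Run X_k = Phi(X_{k-1}) + eps_k from a zero start, with Phi = diag(phi)."""
    x = [0.0] * len(phi)
    path = []
    for eps in noise:
        x = [p * xi + e for p, xi, e in zip(phi, x, eps)]
        path.append(x)
    return path

def moving_average_form(phi, noise, k):
    """Truncated series sum_{i=0}^{k} Phi^i(eps_{k-i}), coordinate by coordinate."""
    return [sum(p ** i * noise[k - i][j] for i in range(k + 1))
            for j, p in enumerate(phi)]

rng = random.Random(1)
phi = [0.6, -0.3, 0.1]                       # illustrative contraction coefficients
noise = [[rng.gauss(0.0, 1.0) for _ in phi] for _ in range(30)]
path = simulate_diagonal_ar1(phi, noise)
series = moving_average_form(phi, noise, 29)
gap = max(abs(u - v) for u, v in zip(path[29], series))
```

Because the coefficients are strictly smaller than one in absolute value, the weight of an innovation $i$ steps in the past decays like $|\phi|^i$, which is the mechanism behind the geometric contraction in (49).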

Denote with $C^x$ the covariance operator of $X$, and let $\Theta = C^x$. We then
obtain the following result.

Corollary 6.2. Grant the assumptions of Proposition 6.1 and let $\Theta = C^x$.
Then there exists a universal constant $C_{\mathcal{C}}$ and universal sequence $s_{\mathcal{C}} \lesssim n^{-1/4}$
such that (C1) holds.

A related result can be established for $\Theta = \mathcal{G}^b$; we omit the details.

6.2. Weak and long memory in econometric and financial time
series

In the presence of serial dependence, the covariance operator $C$ as a single object
is not so relevant in the context of a CLT, and the long-run operator $\mathcal{G}$ is the
key object. However, this can be entirely different if only serial dependence is
present, but essentially no serial correlation, which is often the case in financial
or econometric time series. More recently, there has been considerable activity
(see for instance [6], [28] and particularly [55]) to model financial or econometric
time series with the help of FPCA. In this context, it is well-known (cf. [10]) that
(differenced) stock returns often display a martingale-like behavior, which forms
the basis for many financial discrete time models (e.g. GARCH) and continuous
time models (e.g. semimartingales). On the other hand, it is equally known
that the absolute or squared returns display a completely different behavior,
and sometimes even exhibit long memory (cf. [24]). As a general example, let
us consider the case where $\{\epsilon_k\}_{k \in \mathbb{Z}}$ is an IID sequence in $L^2(T)$, $\{X_k\}_{k \in \mathbb{Z}}$,
$\{Y_k\}_{k \in \mathbb{Z}} \in L^2(T)$ are stationary and satisfy the structural equation

$$X_k = \epsilon_k Y_{k-1}, \quad k \in \mathbb{Z}, \quad Y_k \in \mathcal{E}_k \text{ with } \mathcal{E}_k = \sigma(\epsilon_j, j \le k). \qquad (50)$$

Note that the GARCH-model is a special case of (50), see also Example 2.4
in [35]. Observe that $X_k$ is a martingale difference sequence with respect to
$\mathcal{E}_k$. On the other hand, $X_k^2$ (or $|X_k|$) can behave completely differently due
to $\{Y_k\}_{k \in \mathbb{Z}}$, as is desired from a modelling perspective. This becomes relevant
for the estimator C. While we still have by the martingale CLT (up to mild 
regularity conditions) 

n 

fc=l 

the standard estimator $\hat{C}$ as in (2) in contrast is based on $X_k^2$. Depending on
the behavior of $\{Y_k\}_{k \in \mathbb{Z}}$, we may thus witness the full palette of dependence
when employing $\hat{C}$, ranging from independence to weak dependence or even a
long memory behavior of $X_k^2$. Due to the high degree of flexibility in (C1), our
results thus provide the necessary tools for a more detailed analysis of the model
in (50).
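
The contrast between uncorrelated levels and dependent squares in model (50) is easy to observe in a scalar simulation. The sketch below uses an ARCH(1) recursion with $Y_{k-1} = (\omega + \alpha X_{k-1}^2)^{1/2}$ as one concrete, real-valued instance; the parameter values, the seed and the function names are illustrative assumptions and are not taken from the text.

```python
import math
import random

def simulate_arch1(n, omega=1.0, alpha=0.2, seed=0, burn_in=500):
    """X_k = eps_k * Y_{k-1} with Y_{k-1} = sqrt(omega + alpha * X_{k-1}^2)."""
    rng = random.Random(seed)
    x_prev, path = 0.0, []
    for _ in range(n + burn_in):
        y_prev = math.sqrt(omega + alpha * x_prev ** 2)   # depends on the past only
        x_prev = rng.gauss(0.0, 1.0) * y_prev             # fresh IID innovation
        path.append(x_prev)
    return path[burn_in:]

def lag1_autocorrelation(series):
    """Sample autocorrelation at lag one."""
    n = len(series)
    mean = sum(series) / n
    centred = [v - mean for v in series]
    num = sum(centred[k] * centred[k - 1] for k in range(1, n))
    den = sum(v * v for v in centred)
    return num / den

xs = simulate_arch1(20000)
acf_levels = lag1_autocorrelation(xs)
acf_squares = lag1_autocorrelation([v * v for v in xs])
```

The levels behave like a martingale difference sequence and show essentially no lag-one correlation, while the squares, which are what a covariance estimator is built from, remain clearly correlated.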

7. Proofs of asymptotic expansions 

We introduce the following additional notation. Given functions $f, g \in L^2(T)$
and a kernel $K(r, s)$, we write

$$\int_T fg = \int_T f(r)g(r)\,dr \quad \text{and} \quad \int_{T^2} K fg = \int_{T^2} K(r,s) f(r) g(s)\,dr\,ds. \qquad (51)$$

If we have $f = g$, then we write $f^2 = f(r)^2$ and otherwise $ff = f(r)f(s)$ in
the above notation. We interchangeably use $\langle \cdot, \cdot \rangle$ and $\int_T \cdot$, the latter being more
convenient when dealing with kernels. We also frequently apply Fubini-Tonelli
without mentioning it any further. Next, we introduce the empirical kernel $\hat{D}$
and its analogue deterministic version $D$ as





We first establish the transfer result of Proposition 2.4.

Proof of Proposition 2.4. Due to $E[\|X_k\|^2_{L^2}] < \infty$, standard arguments (cf. [32])
reveal that $C$ exists and satisfies (5) and (6) with eigenvalues $\lambda$ and
eigenfunctions $e$. Moreover, we have that $C$ is of trace class. Since $m = n$, by virtue of
(C2) and since $E[\eta_{k,j}^2] = 1$ for $j \in \mathbb{N}$, we only need to verify (D1). Due to (C1),
it suffices to establish a bound for $\|\eta_{i,j}\|_q$. However, using (7), Cauchy-Schwarz
and (C1), the claim follows.

□ 

We now turn to the proofs of Theorems 2.5 and 2.6, which are developed 
in a series of lemmas. As starting point, we recall the following elementary 
preliminary result (cf. [11]). 

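Lemma 7.1. For $j \neq k$ we have the decomposition

$$\hat{\lambda}_j \int_T e_k(\hat{e}_j - e_j) = \lambda_k \int_T e_k(\hat{e}_j - e_j) + \int_{T^2} (\hat{D} - D) e_k e_j + \int_{T^2} (\hat{D} - D) e_k (\hat{e}_j - e_j). \qquad (53)$$

Rearranging terms, we obtain from the above that (provided $\lambda_k \neq \lambda_j$)

$$\int_T e_k(\hat{e}_j - e_j) = \frac{1}{\lambda_j - \lambda_k} \Big( \int_{T^2} (\hat{D} - D) e_k e_j + \int_{T^2} (\hat{D} - D) e_k (\hat{e}_j - e_j) + (\lambda_j - \hat{\lambda}_j) \int_T e_k (\hat{e}_j - e_j) \Big) \stackrel{def}{=} \frac{1}{\lambda_j - \lambda_k} \big( I_{k,j} + II_{k,j} + III_{k,j} \big), \qquad (54)$$

and

$$\int_T e_k(\hat{e}_j - e_j) = \frac{-\lambda_k + \lambda_j}{\hat{\lambda}_j - \lambda_j + \lambda_j - \lambda_k} \, \frac{1}{\lambda_j - \lambda_k} \big( I_{k,j} + II_{k,j} \big). \qquad (55)$$

Due to the frequent use of relations (54) and (55), it is convenient to use the
abbreviation

$$E_{k,j} = \langle e_k, \hat{e}_j - e_j \rangle$$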


in the sequel. We also recall the following lemma (cf. [11]).
Lemma 7.2. For any $j \in \mathbb{N}$ we have

$$\int_T (\hat{e}_j - e_j) e_j = -\frac{1}{2} \|\hat{e}_j - e_j\|^2_{L^2} \quad \text{and} \quad \int_T (\hat{e}_j - e_j) \hat{e}_j = \frac{1}{2} \|\hat{e}_j - e_j\|^2_{L^2}.$$

We proceed by deriving subsequent bounds for $I_{k,j}$, $II_{k,j}$ and $III_{k,j}$.
Lemma 7.3. Assume that Assumption 2.2 holds. Then for $1 \le q \le p2^{\mathfrak{p}+4}$ we
have

$$\|I_{k,j}\|_q \lesssim m^{-1/2} \sqrt{\lambda_k \lambda_j} \quad \text{uniformly for } k, j \in \mathbb{N}.$$

Proof of Lemma 7.3. Using the orthogonality of ej, e k we have 

h,j = J 0 ~ D)e fc ej = m- 1/2 \J\ k X j m 1/2 (rjfj + r^-), 

hence the claim follows from (Dl), Lemma 2.1 and (D3). □ 

Lemma 7.4. Assume that Assumption 2.2 holds. Then for $1 \le q \le p2^{\mathfrak{p}+3}$ we
have

$$\big\| \|\hat{\mathcal{D}} - \mathcal{D}\|_{\mathcal{L}} \big\|_q \lesssim m^{-1/2}.$$

Proof of Lemma 7.4. Since the Hilbert-Schmidt norm dominates the operator
norm, Parseval's identity and Lemma 7.3 yield the claim, using that (D3)
supplies $\sum_{j=1}^{\infty} \lambda_j < \infty$. □

Lemma 7.5. Assume that Assumption 2.2 holds. Then for 1 < q < p2 p+4 and 
k G IN we have 


I Ih 


max 


k,j I 


iII e j - e j IIl 2 
Proof of Lemma 7.5. It holds that 


< m- 1 ' 2 . 


Hk,j = / (6 - D)e fc (ej - ej) = Y \JX k X t ( 77 ® + (56) 

J T> i= i 

Since $\sum_{i=1}^{\infty} E_{i,j}^2 = \|\hat{e}_j - e_j\|^2_{L^2}$ by Parseval's identity, the Cauchy-Schwarz
inequality gives


Y \!Xi E i,j (riV.z + vf.,) < ( Y Xl (Vkd +r ik ^ 2 


1/2 


(57) 


Hence the triangle inequality, (Dl) and Lemma 2.1 together with (D3) yield 

| Ih,;' X 1 / 2 


max 


1 <i<j£. Il e i - OlliL 2 


rr / 00 „ \ J 

< m~ 1 / 2 \JX k ( Y X i m \\ + r ik,i ) 2 \\ q/2 ) 

1 ' 

< m^ 1/2 J\ k < m~ 1/2 \/~X^. 


□
Lemma 7.6. Assume that Assumption 2.2 holds, and let $\mathcal{A}_j = \{|\hat{\lambda}_j - \lambda_j| \le
\psi_j/2\}$. Then


max PlA^) T, rn 
i <j<Jt V 


< m-“P 2P 


Proof of Lemma 7.6. Proceeding as in Lemma E.2 and E.l in the supplement 
of [34] (or likewise Lemma 18, Lemma 16 in [52]), it follows that for some 
absolute constant C > 0 


p (^)<p( E 


T 2 

1 k,l 


1 2 . 
3,3 
2 


|Afc — Aj||Az — Aj| if) |Afc — AjIV’j 


oo r 2 

V fc.j 

2^ i\, _ 


> C 


k.ifj kfj 

Let p* = p2 p+4 . Then by the triangle inequality and Lemma 7.3 


max 
i <i<Jf 


E 


T 2 

1 k,l 


k 1=1 l Afc A lH A * A il 
k’dAi 


< 


max 


p*/2 1 <3 Jrr 


Similarly, we get that 


max 
1 <3<Jt 


1 2 . 

3,3 


% 


A 


2_ < 


max 


< max „ ^ ....... . , 

p*/2 1 <3<Jt 1 <j<j+ V V m I Afc - Aj 

kAi 


1 y A fc 

l Afc _ A J'I 

kAi 


1 

/m • J I A — 


(58) 


(59) 


and also that 


max 
1 <j<Jm 


E 

k= 1 
kAi 


I 2 

2 k,j 


IA k - Aj#j 


< 


max 


A j 1 Af; A j 

P */2 ~ i<7<7+ Vmifj yfm ^ |A fc - Aj| 

kAi 

A fc N 2 


< max If 

i<j<j;fc \v TO ^ 


zj l Afc A i 

kAi 


(60) 


Observe that due to (D2), (58), (59) and (60) are all further bounded by $\lesssim
m^{-2a}$. Hence we conclude via Markov's inequality and the triangle inequality
that


max 
i <j<Ji 


P(Aj) 


< m~ ap 


which completes the proof. 


□ 


The next result is our key technical lemma. 


Lemma 7.7. Assume that Assumption 2.2 holds. Then uniformly for 1 < q < 
p2 p / 2 + 3 , k G IN and 1 < j < J+ 


| Hk,jl(Aj) | 


< 


\/A /, A :) 


m 


llL 2 | 


2g
Proof of Lemma 7.7. Note first that by construction of $\mathcal{A}_j$, we have that


Xj - A i 


' 1 (-Aj) 


<2, for l^j. 


X j — Xj + Xj — Xi 

Using the decomposition in (55) and bound (61), we obtain that 


(61) 


Kl(A)| < p^(IAil + l^l)l(A)- ( 62 ) 

We now use a backward inductive argument. Let $p_i = p2^i$, $\tau \ge 0$, and suppose
we have uniformly for $k \in \mathbb{N}$

||//fcjl(ylj) || p < m~ 1 / 2 V%c"(\/A7+m _r ) for some * < p + 4. (63) 

Then we obtain from (62), the triangle inequality and Lemma 7.3 that for l ^ j 

ll^.ih-UIL - m 1/2 ,Ai +m ~ r ) ~ ( 64 ) 

Using decomposition (56), Cauchy-Schwarz and Lemma 2.1 together with 
(D3), we get 


I Uk,jl{Aj) | 


< 


\/Xk \/ r Xi\\Ei J l(Aj) | 




;=i 


hence we obtain from Lemma 7.2, inequality (64) and (Dl), (D2) that 


l^l(A)|| Pi _ 1 <^(v / A-||ll^ ej\\U Pi + |A| - Aj| 

^(|| ||% — e j\\i? || Pi + m a )V^ + m “ r ^, 


1 ^ M\A/ + m T ) 


< 


(65) 


and this bound holds uniformly for $k \in \mathbb{N}$. Observe that we have now shown
the validity of relation (63) with the updated value $\tau = \tau + a$, but with respect
to $p_{i-1}$ instead of $p_i$. Since $\lambda_j \ge m^{-\mathfrak{h}}$ with $\mathfrak{h} \ge 1$, it follows that after at most
$\mathfrak{p}/2 + 1 = \lceil \mathfrak{h}/a \rceil/2 + 1$ iterations we have


where $q^* = p2^{\mathfrak{p}/2+3}$. By Lemma 7.5, relation (63) is true for $\tau = 0$ (hence
$m^{-\tau} = 1$) and $i = \mathfrak{p} + 4$, constituting the base step of the induction, hence the proof is
complete. Note that we have also shown


|| g . 


< -i/2 

~ |Ay - A,|’ 


( 66 ) 


which is of further relevance in the sequel. 


□
Proposition 7.8. Assume that Assumption 2.2 holds. Then for $1 \le q \le
p2^{\mathfrak{p}/2+2}$ we have uniformly for $1 \le j \le J^+$


I e j e , 


IIl 2 I 


< 


p(^) 


1/9 


m 


fc=i 


A v Afc 


(A k - A j) 2 


< 


-2a 


kAj 

Proof of Proposition 7. 8. The triangle inequality and Cauchy-Schwarz give 

||II®/ - e illL 2 ||q < 2 P{A c j ) 1,q + || \\ej - e,||^l(A')||q- (67) 

We now invoke the ’traditional’ way of bounding ||e"j — ejH^, (cf. [11], [38]), 
which uses the inequality 


\ e j ~ e j\\h 2 


< 2 XXr 


( 68 ) 


fc =i 
kAo 


Hence using (66) and the triangle inequality, we obtain from (D2) that 

oo oo . . 

||||%-e j ||£. l (A)IL <*EI*s,i(4,)|,S-E -. ' 


k =1 
k ^3 


q ~ m (A j - A ;) 2 

kj^j 


Combining this with (67) gives the first inequality, Lemma 7.6 and Assumption 
2.2 yield the second part. 

□ 

Note that a < 1/2 and hence p/2 > f) > 1 and 2 P//2+2 > 8. Since 
||l|ej-ej||£a|| 2g < V^||||ej-ej||^||J /2 for q > 1, 
we obtain the following corollary to Lemma 7.7. 

Corollary 7.9. Assume that Assumption 2.2 holds. Then for 1 < q < 8p we 
have uniformly for k € IN and 1 < j < 


\ n kJ < 


-a 


m 


Proof of Corollary 7.9. Lemma 7.5, Lemma 7.6, Lemma 7.7 and Cauchy-Schwarz 
give 

\\HkjL < phjliAM + ||/4jlK C )L £ + ^ m ~ a P(Ap /2q 


< 


3) II q 

a/AjA k a \/\ k a - a p2 f+3 /q 

— — -j=.—m H-—m m p /q . 

'm Jm 


Since ap2 p+3 /<7 > a2 p > f), we have m a P 2P+3 /9 < A 7 + by (D2) and the claim 


follows. 


□ 














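Lemma 7.10. Assume that Assumption 2.2 holds. Then for $1 \le q \le 4p$

$$\|\hat{\lambda}_j - \lambda_j - I_{j,j}\|_q \lesssim \frac{\lambda_j}{\sqrt{m}} m^{-a}, \quad \text{and} \quad \|\hat{\lambda}_j - \lambda_j\|_q \lesssim \frac{\lambda_j}{\sqrt{m}}, \quad \text{uniformly for } 1 \le j \le J^+.$$

Proof of Lemma 7.10. We have that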


Xj — j TAcjCj — j TA(ej e j) e j + / TJcjCj 


Jr 2 


IT 2 


Jr 2 


— Xj / (ej 6j ) e.j + / (D D )ejej + / TXejej 

Jr Jr 2 Jt 2 


2 \\ e i e J11 l 2 


Since by Lemma 7.2 


IT 2 


(D — D )ej{fij ~ej)+ (D — D^-e^ + / D ej 


IT 2 


it 2 


It 2 


^ e j e j — I D e ji e j e j) + / Tiejej — 2 ll e i e J lit . 2 » 


ir 2 


it 2 


we obtain by rearranging terms (if ||e/ — ejWfp <2) 
2 


3 3 2 _ \\ e o — e j IIl 2 \Jt 2 


(D - D )ejej + (D - D)e, (e, - e^-) 


lr 2 


2 ll e t 11 £ 2 


{ijj + Hj,j) ■ 


(69) 


Let Bj = {||e, — e 3 j|/ 2 < l}. By Lemma 7.3, Proposition 7.8 and the Cauchy- 
Schwarz inequality we obtain 


I, , I 1- 


2 ll e i e j 11 l 2 


< 


| 29 I|H®J “ e illi. 2 1| 


29 


< m ~ 2a . 


m 


Similarly, Corollary 7.9 yields that 
//,,(i- 


2 ll e i e i\\iJ 2 


1(3,) 




(70) 


(71) 


Let T> = {||2? — X>|| < l}. Lemma 7.4 and Markovs inequality then yield that 


P( V c ) 


< m~ 2ap2 


p/2+3 


(72) 


On the other hand, Proposition 7.8 implies that P{B C j ) < m 2o p 2P/2+2 . Since 

1) > 1,1/2 > o we have 2 p / 2 > 1/2 + l/4a+ h/2a and hence m~ 2a2V/ < 

?7i — I/ 2- °Aj+ by (D2). Combining (69), (70), (71) and (72) we obtain from 


<Jl- 
the Cauchy-Schwarz inequality, Lemma 1.1 (see [11] for a general version) and
Lemma 7.4, that

l|A; - ^ - I jtj || ? < + || ||© - ©|U|| 2 q P(V c ) 1/2q + 

\Jm 

which gives the first claim. The second claim follows from Lemma 7.3. 

□ 

Lemma 7.11. Assume that Assumption 2.2 holds. Then for 1 < q < 2p we 
have uniformly for k £ IN and 1 < j < 


\nh A (A) 


I < Aj \/AfcAj 

9 ~ m |A k - Aj| 


< 


x/AfeAj a 


Proof of Lemma 7.11. Recall that IIIk,j = (A j — Xj^Ekj. By the Cauchy- 
Schwarz inequality and Lemma 7.10, we have that 




< 


A 




\2q- 


Hence the claims follow from inequality (66) and (D2). 

□ 


For the sake of reference, we state Pisier's inequality.


Lemma 7.12. Let $p \ge 1$ and $Y_j$, $1 \le j \le J$, be a sequence of random variables.
Then

$$\Big\| \max_{1 \le j \le J} |Y_j| \Big\|_p \le \Big( \sum_{j=1}^{J} \|Y_j\|_p^p \Big)^{1/p} \le J^{1/p} \max_{1 \le j \le J} \|Y_j\|_p.$$
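As a quick numerical sanity check of Pisier's inequality, the following sketch evaluates the three quantities $\|\max_j |Y_j|\|_p$, $(\sum_j \|Y_j\|_p^p)^{1/p}$ and $J^{1/p}\max_j \|Y_j\|_p$ under the empirical measure of a simulated sample. The Gaussian test data, the seed, and the helper names `lp_norm` and `pisier_terms` are illustrative assumptions and do not come from the paper.

```python
import random


def lp_norm(values, p):
    """Empirical L^p norm (mean |Y|^p)^(1/p) of a sample."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


def pisier_terms(samples, p):
    """samples[j] holds the draws of Y_j; all lists have the same length.

    Returns the three terms of the inequality chain, left to right.
    """
    n = len(samples[0])
    J = len(samples)
    # Pointwise maximum over j for every draw i.
    row_max = [max(abs(samples[j][i]) for j in range(J)) for i in range(n)]
    lhs = lp_norm(row_max, p)
    norms = [lp_norm(s, p) for s in samples]
    middle = sum(x ** p for x in norms) ** (1.0 / p)
    rhs = J ** (1.0 / p) * max(norms)
    return lhs, middle, rhs


rng = random.Random(0)
# Hypothetical test data: 25 centred Gaussians with slowly growing scale.
samples = [[rng.gauss(0.0, 1.0 + 0.1 * j) for _ in range(2000)] for j in range(25)]
lhs, middle, rhs = pisier_terms(samples, p=4)
print(lhs, middle, rhs)
```

Since the empirical measure is itself a probability measure, the chain holds exactly for the sample (up to floating-point rounding), not merely approximately.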




We are now ready to prove Theorems 2.5 and 2.6.

Proof of Theorem 2.5. This readily follows from Lemma 7.10 and Lemma 7.12. 

□ 


Proof of Theorem 2.6. We treat the first claim. By Lemma 7.2 we have the 
decomposition 


lk.j T H/r.j T IHk.j def 


e j - e 3 — ~Trl! e ! - e l||lL 2 + E ek ~ -\ — \- — — “A? + Bj. (73) 

fc=1 


A j A k 


Note that by the triangle inequality 


I^iIIl 2 — ll e i 11 l 2 II^IIl 2 — 









Let Cj = J2kLi e/; x Ik _l\ k ■ Then another application of the triangle inequality 
k^j 3 

gives 

II e j — e j + A; — ^jIIl 2 — II^'IIl 2 II^'IIl 2 — ^ + II^'IIl 2 ' 

Hence by the Cauchy-Schwarz inequality and Lemma 7.3 


e -j a 


■Aj-Cj I 


]L 2 


!(A c )IU 4 P K c ) 1/p + ^K c ) 


1/ 2 p/ 1 


E 

k =1 


Aj Afc 


(Aj - A fc ) 2 


which by Lemma 7.6 and (D2) (arguing as in the proof of Lemma 7.10) is 
bounded by 


e o ~ e o + Aj - Cj\\izl(Af) || p < m 1/2 0 (A j+ + vE)■ 


Lemma 7.12 and the inequality A j > x 3 ^ > Aj A 1 then show that it suffices to 
consider event Aj. Corollary 7.9 and Lemma 7.11 give 


^ (/4 j +// 4 j ) 2 W/I 
^ (Aj — A*;) 2 ^ 


k =1 

k^j 


A 7 Afc 


p 


hence the first claim follows from Lemma 7.12. Next, we treat the second claim. 
As before Lemma 7.2 yields 


II 2 1 ||_ M 4 

c -j g 7 L r, — \\c 7 g j 


E 


(4,j + Hk,j + IHkj) 2 


3 J|lL2 4" 3 J|lL2 ' ^ (Aj — Afc) 2 

k^j 

Proceeding as in the first claim, one shows that it suffices to consider the event 
Aj. Let T>j = {||Gj — ej || L 2 < m Then proceeding as in Lemma 7.10 we 
obtain 

P(pfj < m- ap2P/2+2 < m~ p ~ 2ap X p + . (74) 

We thus obtain from Lemma 7.3, Corollary 7.9, Lemma 7.11 and (74) 

' 2 *1 (Aj 


I e 0 e 3 ||l 2 


2 


\ e i e 3\\^ J -\^3, 


k^i 


+ P(VC) 


1/p 


^(4,j + 74,j + //4,j) 2 -J 2 JW/i N 
^ (A? ^ Afc) 2 


fc=i 
k^j 


< 




+ m- 1 - 2a \j++m- 1 - a Y J 


A 7 Afc 


fc=i 


(Aj Afc) 


2 ' 


(75) 


1/2 
Iterating this inequality once and rearranging terms, Lemma 7.3 yields that


e,- — e 


■3 IIl 2 


-£ 

k —1 
k^j 


T 2 

1 k,j 


(\j - x k y 


i(a-: 


< 


A 


+ 


1 


■, 1 + 2(1 


m 


aE 




k^j 


Since A j > —> A j A 1, an application of Lemma 7.12 yields the desired 
result. 

□ 

Proof of Proposition 2.7. Observe that since E \j)^ 3 ] = 0 for k ^ j, we get that 

h,j = ((©-©) (e fc ), e,) = (C' + <j)~ 

Since Aj = A+Ef+A], the claim follows from (Dl) and routine calculations. 

□ 

Proof of Corollary 2.8. The claim follows from Proposition 2.7 and (Dl). 

□ 


7.1. Proofs of Lemma 7.13 and Theorem 2.9


We first provide the following result about the convexity relations of A x . 
Lemma 7.13. If (9) holds, then (10) is valid. 

Proof of Lemma 7.13. For the proof, the following relations are useful, which 
can be found in [16], [21]. 

If $j > k$ and (9) holds, then $k\lambda_k \ge j\lambda_j$ and $\lambda_k - \lambda_j \ge (1 - k/j)\lambda_k$.
Moreover, it holds that

$$\sum_{k > j} \lambda_k \le (j+1)\lambda_j. \tag{76}$$
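The relations in (76) can be checked by hand for a concrete sequence. The sketch below assumes, as the surrounding text suggests, that (9) is a convexity condition on $j \mapsto \lambda_j$, and uses the polynomially decaying sequence $\lambda_k = k^{-a}$ with $a \ge 2$ as a hypothetical example. The tail sum is not computed exactly; it is replaced by the integral upper bound $\sum_{k>j} k^{-a} \le j^{1-a}/(a-1)$, which is enough to confirm the inequality.

```python
def lam(k, a=2.0):
    """Hypothetical eigenvalue sequence lambda_k = k^(-a)."""
    return k ** (-a)


def tail_upper_bound(j, a=2.0):
    """Integral comparison: sum_{k>j} k^(-a) <= int_j^inf x^(-a) dx."""
    return j ** (1.0 - a) / (a - 1.0)


def relations_hold(j_max, a=2.0):
    """Check the three relations in (76) for all 1 <= k < j <= j_max."""
    eps = 1e-15  # guards against rounding only
    for j in range(1, j_max + 1):
        # sum_{k>j} lambda_k <= (j + 1) lambda_j, via the integral bound
        if tail_upper_bound(j, a) > (j + 1) * lam(j, a) + eps:
            return False
        for k in range(1, j):
            # k lambda_k >= j lambda_j
            if k * lam(k, a) < j * lam(j, a) - eps:
                return False
            # lambda_k - lambda_j >= (1 - k/j) lambda_k
            if lam(k, a) - lam(j, a) < (1.0 - k / j) * lam(k, a) - eps:
                return False
    return True


print(relations_hold(200, a=2.0), relations_hold(200, a=3.0))
```

Note that the second relation is just a rearrangement of the first: $\lambda_k - \lambda_j \ge (1 - k/j)\lambda_k$ is equivalent to $j\lambda_j \le k\lambda_k$.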


Now by (76) we have 


E 


AfcA ? 


A,Afc x ' k XqXk, x ^ XjXfc ^ _2 


i __ < -2 A j A k sr h A .i Ak I _ 

fc=1 v -+ ~ 3 h. (* - J) 2 K j) 2 A 2 + A? 

k^j 


—; (a j - x k y 


<+■ 


j>k x k j<k 

In the same manner, one shows that 

A k 


3 2j<k 3 


£ 

k=1 


IA j Xk 


< 


jlogj- 


□ 
Proof of Theorem 2.9. First note that due to the Gaussianity of X, scores $\eta_{k,i}$
and $\eta_{k,j}$ are mutually independent for $i \ne j$. Given independent standard Gaussian
random variables X, Y, the function XY - 1 is a two-dimensional second
degree Hermite polynomial. If X = Y, then X 2 — 1 is a univariate Hermite 
polynomial of second degree. We may now invoke Theorem 4 in [3]. The proof 
is based on the method of moments for partial sums of Hermite polynomials. 
In particular, using that sup^-g^ YlkLo Cov(r/oy, Vk,jj < oo (which follows from 
a > 3/4) it is shown via the Diagram formula that for any fixed p € IN 

Vn max \\rjf ,|| < oo and Vnrjf ,• A AA(0,' •). (77) 

Moreover, since a > 3/4 one readily shows that maxj g ]N||n~ 3 / 4 ]C/J =1 %j|| 2g = 
o(l) for any fixed q £ IN. Hence (Cl) holds and using Proposition 2.7 the CLT 
for A j follows. In order to prove a CLT for e.j we proceed as follows. Denote with 


Cj = £ 




Ik, 


k =1 
k^j 


Xj Xfc 


ci* = £ 


&k 


Ik, 


k= 1 
k^j 


Xj A/- 


for d > j. 


Due to Theorem 2.6 and Lemma 7.3, we have that 




o(l). 


It thus suffices to consider Cj. Since J2k>d^ k ~~ 5 ► 0 as d increases, Lemma 7.3 
implies that for any S > 0 there exists ds € IN such that 

V^E [||Cj - C,, ds \\ L 2\ < 6. (78) 

It now suffices (cf. [49]) to establish that for any fixed d € IN (which includes 
the case d = ds) 


yfcC jid ^N{ 0,£ d ), (79) 

where £ IR, d x IR.^ denotes the corresponding covariance matrix. But, since 
we have for i( ^ ji, l £ {1,2} that 

E [nfi = 0 if either h ± i 2 or j x ± j 2 , 

we may apply Theorem 4 in [3] due to sup^-g^ Cov(?7o,j) Vk,j) 2 < c», which 
gives (79). This completes the proof. 

□ 


8. Proofs of Section 3 

For the proof of Proposition 3.1, we require some preliminary results. 
Lemma 8.1. For $p \ge 2$, let $\{X_k\}_{k \in \mathbb{Z}} \in L^2$ satisfy

$$\sum_{k=1}^{\infty} \big\| \|X_k - X_k'\|_{L^2} \big\|_p < \infty.$$

Then

$$\big\| \|X_1 + \ldots + X_n\|_{L^2} \big\|_p \lesssim \sqrt{n}.$$

Lemma 8.1 comes as a byproduct of the results in [44], see also Lemma 10.3 
and [63] for the original argument for real-valued sequences, which we also use in 
the sequel. As a next result, we state a special type of Hoeffding decomposition.

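Lemma 8.2. Let $\{X_k\}_{k \in \mathbb{Z}}, \{Y_k\}_{k \in \mathbb{Z}} \in \mathbb{R}$ be stationary such that for $p \ge 2$

$$\sum_{k=1}^{\infty} \|X_k - X_k'\|_{2p} < \infty, \quad \sum_{k=1}^{\infty} \|Y_k - Y_k'\|_{2p} < \infty. \tag{80}$$

Denote with

$$A_k = (X_k - E[X_k])E[Y_1] + (Y_k - E[Y_k])E[X_1].$$

Then

(i) $\big\| \sum_{1 \le k,l \le n} X_k Y_l - n \sum_{k=1}^{n} A_k - n^2 E[X_1]E[Y_1] \big\|_p \lesssim n$,

(ii) $\big\| \sum_{k=1}^{n} A_k \big\|_{2p} \lesssim \sqrt{n}$.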

Proof of Lemma 8.2. Using the Hoeffding decomposition

$$\sum_{1 \le k,l \le n} X_k Y_l = \sum_{1 \le k,l \le n} (X_k - E[X_k])(Y_l - E[Y_l]) + n^2 E[X_1]E[Y_1] + n \sum_{k=1}^{n} A_k,$$

claim (i) follows from the triangle inequality, Cauchy-Schwarz and Lemma 10.3.
Claim (ii) follows directly from Lemma 10.3.

□ 
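The decomposition $\sum_{k,l} X_k Y_l = \sum_{k,l} (X_k - E[X_k])(Y_l - E[Y_l]) + n^2 E[X_1]E[Y_1] + n\sum_k A_k$ used in the proof of Lemma 8.2 is a purely algebraic identity: it holds for arbitrary real numbers in place of the two means. The following sketch confirms this on made-up data; the function name `hoeffding_sides` and the numbers are illustrative assumptions.

```python
def hoeffding_sides(x, y, ex, ey):
    """Return both sides of the decomposition for samples x, y.

    ex and ey play the roles of E[X_1] and E[Y_1]; the identity is
    algebraic, so any real numbers may be supplied.
    """
    n = len(x)
    lhs = sum(xk * yl for xk in x for yl in y)
    centered = sum((xk - ex) * (yl - ey) for xk in x for yl in y)
    # A_k = (X_k - E[X_k]) E[Y_1] + (Y_k - E[Y_k]) E[X_1]
    a = [(xk - ex) * ey + (yk - ey) * ex for xk, yk in zip(x, y)]
    rhs = centered + n * n * ex * ey + n * sum(a)
    return lhs, rhs


x = [1.0, 2.0, 4.0, -0.5]
y = [0.5, -1.0, 3.0, 2.0]
lhs, rhs = hoeffding_sides(x, y, ex=0.7, ey=-0.2)
print(lhs, rhs)
```

The probabilistic content of the lemma lies only in the moment bounds for the centered double sum and for $\sum_k A_k$, not in the identity itself.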


Proof of Proposition 3.1. Let us first mention that the assumptions of Proposi¬ 
tion 3.1 clearly imply those of Lemma 8.1 and Lemma 8.2. As another pre¬ 
liminary remark, observe that E[||Xfc||^ 2 ] < oo implies that Ch exists and 

X k = Eyli ^ 1 / 2 ' r 1k,j e j w ith Ej°u A? < °°- N ex C denote with 

r)% = -(AiAj)" 172 ^),^-) + r 7 ®., i,j £ F. (81) 

Employing Lemma 8.1, lengthy routine calculations reveal that (here condition 
b > 3/2 is helpful) 


E 


Y < Xl + h - x ™> x k+h - X n ){X k -X nr ) (Xt - x n ) 


l<k,l<n—h 


- Y, ( x i+ h _ X k+h - p) (X k — p, -)(Xi — p) 

l<k,l<.n—h 


< nP. 


(82) 
we spare the details. Observe next that we have the representation

(Xi+h ~ P,X k+h - n){X k - p, -)(Xi - p) 

l<k,l<n—h 


— ^ ^ ^ ^ V ^2 ^rVl+h,rVl,iyk+h,rVkj(ei, -)ej. (83) 

1 <k,l<n—h i,j —1 r— 1 

From the triangle inequality and Cauchy-Schwarz, we obtain 

max \\ r U+h,rVl,i ~ ( r ll+h,rVl,iy\\or t ^4p(l + h) +f2 4p (i), l, h € IN. (84) 

i,r£lN r 


Hence by (82) and Lemma 8.2 (i) (applicable by (84)), A r < oo, we obtain 

< n- 1 / 2 . 


n 1 / 2 max 11 1 

ijew" ’ n 


Finally, using Lemma 8.2 (ii) (applicable by (84)) we get 

n 1 ' 2 max 11 rT®-11 < oo. 

ijew" ,j "p 


Finally, we remark that the same calculations used to derive (22) also reveal 
that E[tj®] = 0 for i ^ j. Hence (6) holds, which completes the proof. 

□ 

Proof of Theorem 3.3. Note first that an application of Lemma 8.1 together 
with routine calculations gives 

$$\big\| \|\hat C_h - C_h\|_{\mathcal{L}} \big\|_{p'} \lesssim n^{-1/2}, \quad 1 \le p' \le p 2^{p+2}. \tag{85}$$

Let us make the decomposition 


/, - fj = ( A y 2 fj - A Tfj + (A T - AV")/- 


1/2. 






a} /2 -a 


1/2 


,!/2 


(A 


■3 Aj) 1 / 2 


and also 


A ) ,2 fj - A ) /2 fj = C h (e 3 ) - C h ( ej ) 

= C. h {e.j - ej) + {C h - C h ) (ej) + (C h - C h ) (ej - e^). (86) 

Using (85), elementary computations yield 

||I|aV 2 ^-aV 2 /,|| l2 || p , 

— ||^/i1| £ || W e j ~ e i IIjL 2 lip' || 11^ _ C fc |Lc|| 2p ,(l II ll e J ~ IIl 2 112q) 

1111% - SjIIl 2 || 2p / +n~ 1/2 , 1 <p' < P 2 p+2 . 


( 87 ) 
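Next, for $j \in \mathbb{N}$ consider the set $\mathcal{C}_j$ defined as

$$\mathcal{C}_j = \{\hat\lambda_j \ge \lambda_j/2\}, \quad P(\mathcal{C}_j^c) \lesssim n^{-2p}, \quad j \in \mathbb{N}, \tag{88}$$

where the bound for $P(\mathcal{C}_j^c)$ follows from Markov's inequality and Lemma 7.10.
Since $\|\hat f_j\|_{L^2} = \|f_j\|_{L^2} = 1$, we thus obtain

$$\big\| \|\hat f_j - f_j\|_{L^2} \mathbb{1}_{\mathcal{C}_j^c} \big\|_{p'} \le 2 \|\mathbb{1}_{\mathcal{C}_j^c}\|_{p'} \lesssim n^{-2p/p'}, \quad p' \ge 1. \tag{89}$$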

Similarly, since $C_h$ is a bounded operator, the triangle inequality, Cauchy-Schwarz,
Markov's inequality, (85) and (88) yield for $1 \le p' \le p$


||II(A? - X j )f j /(2X 1 / 2 ) +C h (e j - e 0 ) + (C h - C^)(e,-)|k 2 lq|| p , 

~ ll*i“ A j|l2p'll 1 C J e ll2p'/ A ) /2 + ll 1 C|llp' + II W^h ll/I |]gp/1| || 2p / 

< n- 1/2-p/p' A 1/2 + n - 2 P / P ’ + n -H 2 - p/p ' < n -3/2. 

_^ /2 

Multiplying with , we see that it suffices to establish the claim on the set 
Cj. To this end, observe that 


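$$\Big| \hat\lambda_j^{1/2} - \lambda_j^{1/2} - \frac{\hat\lambda_j - \lambda_j}{2\lambda_j^{1/2}} \Big| \le \frac{(\hat\lambda_j - \lambda_j)^2}{2\lambda_j^{3/2}}, \quad j \in \mathbb{N}. \tag{90}$$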
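The first-order expansion of the square root behind this step rests on the elementary inequality $|\sqrt{a} - \sqrt{b} - (a-b)/(2\sqrt{b})| \le (a-b)^2/(2b^{3/2})$ for $a \ge 0$, $b > 0$, which follows from $\sqrt{a} - \sqrt{b} = (a-b)/(\sqrt{a}+\sqrt{b})$. The sketch below checks it on a grid, with $a$ in the role of the empirical eigenvalue and $b$ in the role of the population eigenvalue; the function names and the grid are illustrative assumptions.

```python
import math


def sqrt_expansion_error(a, b):
    """|sqrt(a) - sqrt(b) - (a - b) / (2 sqrt(b))| for a >= 0, b > 0."""
    return abs(math.sqrt(a) - math.sqrt(b) - (a - b) / (2.0 * math.sqrt(b)))


def sqrt_expansion_bound(a, b):
    """Quadratic remainder bound (a - b)^2 / (2 b^(3/2))."""
    return (a - b) ** 2 / (2.0 * b ** 1.5)


# Grid of hypothetical (empirical, population) eigenvalue pairs.
grid = [(0.25 * i, 0.25 * j) for i in range(0, 41) for j in range(1, 41)]
violations = [
    (a, b)
    for a, b in grid
    if sqrt_expansion_error(a, b) > sqrt_expansion_bound(a, b) + 1e-12
]
print(len(violations))
```

The exact remainder is $(a-b)^2 / (2\sqrt{b}(\sqrt{a}+\sqrt{b})^2)$, so the bound is off by at most a factor of four when $a$ is close to $b$.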


Then (87), (90), Cauchy-Schwarz and Lemma 7.10 yield 


,1/2 _ Vl/2 

lNl/2? \l/2 f , f \ 1/2 Nl/2\ r II a j A j i 

A , fj A + { A. A. UjWjj,— - lr 


(aa -) 1/2 


< 


(||l|e 7 --e j || L 2 || 4p , +n 1/2 )(X jn ) 


- 1/2 


1 < p' < p. 


(91) 


Using (85) and (86) together with Cauchy-Schwarz, (90) together with Lemma 
7.10 and combining this with (91), the triangle inequality gives 


< 


fj - fj 
1 


(Xj — Xj)fj Ch(ej e i) + (Ch Ch)(ej) 


2Xj 


. 1/2 


L 2 


1 C, 


P' 


I e j e lllL 2 | 


4 p' 


y/Xj ' 1 


l 

+ -. 

n n 


(92) 


□ 


9. Proofs of Section 4 

Proof of Theorem 4.2. Since $\sum_{h \in \mathbb{Z}} \|C_h\|_{\mathcal{L}} < \infty$, $\mathcal{G}^b$ exists, and by $C_h^* = C_{-h}$,
$\mathcal{G}^b$ is symmetric. Hence by the spectral theorem, (31) holds. Together with (34),
this gives (5) and (6). It remains to derive a bound for To this 
end, put rj b = g b {n) = (X n — p,e b )(X b ) 1 ^ 2 . Since b = o(n), routine calculations
then reveal the upper bound 


I rj^ 

I '*,j 


I <1 

'q ~ n 


b 

E 

h =1 


I E 

k=h -\-1 


Vk,iVj | 


I E rivl-hj 

k=h -\-1 


■ n lkEI 


(93) 


Using Cauchy-Schwarz and (Gl) b , the claim then follows. 

□ 


Unfortunately, the proofs of Theorems 4.6 and 4.7 turn out to be lengthy
and technical. To explain how and why, let us briefly elaborate on the main
difficulties and how they can be overcome. The main objective of course is to
transfer everything to Theorem 4.2. This means that we need to show that
Assumption 4.4 implies Assumption 4.1. But here the main problem arises. For
instance, even though we can control the difference $\|\mathcal{G}^b - \mathcal{G}\|_{\mathcal{L}}$ quite well, this
is not sufficient to guarantee the validity of $(G2)^b$. The problem here is that we
need to control the whole sequence $\{\lambda_j^b\}_{j \in \mathbb{N}}$ with the help of $\{\lambda_j\}_{j \in \mathbb{N}}$, but this
is impossible if $j$ is very large. Related difficulties arise for $(G1)^b$ and $(G3)^b$. In
order to circumvent these problems, we first work with the truncated sequence

$$X_k^\tau = \sum_{j=1}^{\tau} \sqrt{\lambda_j}\, \eta_{k,j}\, e_j, \quad \tau \in \mathbb{N},\ k \in \mathbb{Z}.$$

The key reason why this works is the simple fact that truncation does not change
the first $\tau$ eigenvalues and eigenfunctions. Let us elaborate on this in more detail.
Define the truncated long-run covariance operator as

CT(-) W 5>[<XI,.)XU]= E yf&i'PijMej =EA j (e j ,-)e j! (94) 

hez ij =i j =i 

where the last equality follows from (35) with b = oo (<*9°° = '-fij). Observe that 
this last equality implies that $\{\lambda_j\}_{1 \le j \le \tau}$ and $\{e_j\}_{1 \le j \le \tau}$ are also eigenvalues and
eigenfunctions of $\mathcal{G}^\tau$. This is a key observation that we heavily use in the sequel,
and therefore state as a lemma for the sake of reference.

Lemma 9.1. The first $\tau$ eigenvalues and eigenfunctions of the truncated covariance
operator $\mathcal{G}^\tau$ as in (94) are $\{\lambda_j\}_{1 \le j \le \tau}$ and $\{e_j\}_{1 \le j \le \tau}$.
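The content of Lemma 9.1 can be illustrated in finite dimensions, where an operator with spectral decomposition $\sum_j \lambda_j \langle e_j, \cdot\rangle e_j$ is a symmetric matrix. The sketch below builds a hypothetical orthonormal basis from a Householder reflection, forms the truncated matrix $\sum_{j \le \tau} \lambda_j e_j e_j^\top$, and checks that the first $\tau$ eigenpairs are retained while the remaining basis vectors lie in the kernel. Dimension, eigenvalues and function names are assumptions chosen for illustration only.

```python
def householder_basis(v):
    """Orthonormal basis of R^d given by the columns of I - 2 v v^T / (v.v)."""
    d = len(v)
    vv = sum(c * c for c in v)
    return [
        [(1.0 if r == c else 0.0) - 2.0 * v[r] * v[c] / vv for r in range(d)]
        for c in range(d)
    ]


def truncated_operator(lams, basis, tau):
    """Matrix of sum_{j < tau} lams[j] * e_j e_j^T."""
    d = len(basis)
    return [
        [sum(lams[j] * basis[j][r] * basis[j][c] for j in range(tau)) for c in range(d)]
        for r in range(d)
    ]


def apply_operator(mat, vec):
    return [sum(m * x for m, x in zip(row, vec)) for row in mat]


def residual(mat, vec, lam):
    """Sup-norm of mat @ vec - lam * vec; zero for an exact eigenpair."""
    return max(abs(a - lam * b) for a, b in zip(apply_operator(mat, vec), vec))


lams = [4.0, 2.0, 1.0, 0.5, 0.25]  # hypothetical eigenvalues
basis = householder_basis([1.0, 2.0, 3.0, 4.0, 5.0])
tau = 3
g_tau = truncated_operator(lams, basis, tau)
residuals = [
    residual(g_tau, basis[j], lams[j] if j < tau else 0.0) for j in range(len(lams))
]
print(max(residuals))
```

The same bookkeeping explains the later remark that the truncated operator has rank at most $\tau$.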

The main strategy for the proofs of Theorems 4.6 and 4.7 consists of the following
two steps.

(Step 1) Verify Assumption 4.1 for $\{X_k^\tau\}_{1 \le k \le n}$.

(Step 2) Control the error of replacing $\{X_k\}_{1 \le k \le n}$ with $\{X_k^\tau\}_{1 \le k \le n}$.

(Step 1) will require most of our attention. In order to deal with it, we introduce 
the truncated version of Q b , namely 

q°(-) = E \AEE7’ b) ( ei >‘> e u where k7’ fc) = E e [ 7? m 7 ?oj1 ( 95 ) 

*,.7=1 \h\<b 
Observe that this is a linear, symmetric Hilbert-Schmidt operator. We also denote
with Q the truncated estimator, which we define as in (28) with Xk
replaced by X%. Denote with A®, A*, e*, e® and the analogue 

quantities, and also put 

= -&)&),e?), 1 

Obviously has finite rank bounded by r, and hence A* = 0 for j > t and 
span({e j}j>r) Q Ker(^°). This implies lm(0 <> ) © 5° = span({ej}i<j< T ) for 
a linear subspace <S° C Ker(<?°) such that dim(Im(<5°)) + dim(«S°) = r, and 
thus we get 


x l = E( A i) 1/2 C- e i> where W 1/2 nh = ( X Z> e P> A i = E [( x k >^) 2 ]• 


i=i 


Throughout the remaining proofs, we make the following convention: $0 < \rho < 1$
is an absolute constant that may vary from line to line. We write

$$\tau = n^{\mathfrak{t}}, \quad 0 < \mathfrak{t} < \infty, \tag{96}$$

and often use the expression 'for sufficiently large (but finite) $\mathfrak{t}$, $C_0 > 0$', where
$C_0$ only depends on $c^-, c^+, \mathfrak{t}$ (recall $b \ge C_0 \log n$). There is no danger of circular
arguments: we always pick $\mathfrak{t}$ first, then $C_0$. Next, we consider a more general
version of Lemma 1.1 (cf. [8], [11]).

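Lemma 9.2. Let $\mathcal{G}$, $\mathcal{H}$ be linear Hilbert-Schmidt operators with eigenvalues
$\{\lambda_j^{\mathcal{G}}\}_{j \in \mathbb{N}}$, $\{\lambda_j^{\mathcal{H}}\}_{j \in \mathbb{N}}$ and eigenfunctions $\{e_j^{\mathcal{G}}\}_{j \in \mathbb{N}}$, $\{e_j^{\mathcal{H}}\}_{j \in \mathbb{N}}$. If $\mathcal{H}$ is positive
definite, symmetric and $\lambda_1^{\mathcal{H}} > \ldots > \lambda_{j+1}^{\mathcal{H}}$, then

$$|\lambda_j^{\mathcal{G}} - \lambda_j^{\mathcal{H}}| \le \|\mathcal{G} - \mathcal{H}\|_{\mathcal{L}}, \qquad \|e_j^{\mathcal{G}} - e_j^{\mathcal{H}}\|_{L^2} \le \frac{2\sqrt{2}}{\psi_j^{\mathcal{H}}} \|\mathcal{G} - \mathcal{H}\|_{\mathcal{L}},$$

where $\psi_j^{\mathcal{H}} = \min\{\lambda_{j-1}^{\mathcal{H}} - \lambda_j^{\mathcal{H}}, \lambda_j^{\mathcal{H}} - \lambda_{j+1}^{\mathcal{H}}\}$ (with $\psi_1^{\mathcal{H}} = \lambda_1^{\mathcal{H}} - \lambda_2^{\mathcal{H}}$).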

In the sequel, all operators Q, H in question will satisfy the conditions of 
Lemma 9.2. The next lemma is our main tool box and summarizes most technical 
preliminary results we require in the sequel. To this end, recall the notion of 


Vl 


Vi 


and r]\ 


(39). 


Lemma 9.3. Assume that Assumption 4.4 and condition (96) hold. Then for
$1 \le q \le p^*$ and sufficiently large $\mathfrak{t}$, $C_0 > 0$, we have


(i) y/n/b\\ri^’ b ’ 2 ) \\g} < 

(ii) max,;j e]N |E[?7fe,i?7oj] j < p k , 0 < p < 1, 

(m) \\g b - g*\\ c> \\\\g b - d'hWg Sn- {c+ - 1)l , 

(iv) \\G T -Q°\\ c <p b ,n<p<l, 

(v) \\\\g b -g b Wc\\ q , Wild*-g°Wc\\ q < 


oo, 
<n 2 A ;


(vi) maxi< J -< T ||e J - - e*|| L2 < p b , 0 < p < 1, 

(vii) ma,x 1 < j < J +\\e b j - e?\\ h2 , max is < j+ 1| H't,- ~y\\^\\q ~J‘ 

(viii) maxi<,,< T {Aj/A^A^/Aj} < 2. max-i<j < r {A y / A ;; . A ;; A ; | < 1. 

Proof of Lemma 9.3. Throughout the proofs, we frequently use representations 
(31), (34) of Q b , Q , and an analogue representation for <?°, Q (see (95)). Claim 
(i) follows from (Gl) and the results in [64], [33], which are actually much more 
general. Claim (ii) can be established with the same arguments as in the proof 
of Lemma 10.6. Observe next that from (G2) and (G3) we get that 


X ‘ < 22 X ‘ 


< n-( c+ -^. 


(97) 


1>T 


1>T 


The first part of claim (iii) then follows from elementary computations, (ii) and 
(97). For the second part, observe that by routine calculations we obtain 


is -s 

1>T 


(00,6) 


|| q + o(s/bJn)) 


+ E Al E max|E[j 7 M J 7 oj]|. 

l>r \h\<b 

Hence the claim follows from (i), (ii) and (97). Claim (iv) can be established 
as follows. Due to Lemma 9.1 we have A j = AJ and ej = ej for 1 < j < t. 
Hence from the representations in (94) and (95), using Cauchy-Schwarz, (ii), 
(G2) and (G3) we get 


S r - S° I \c~Yl X i 1EZ MvhjVoj] I £ P b X i ~ 


P b ■ 


3=1 |h|>6 3=1 

Claim (v) can be established as follows. For the first part, using (i) we get that 

b 




^Ai(max ||i7*™’ b) ||, +o{VVn)) < \/bJn. 

A ' «,7GJN ’ J 

1=1 


For the second part, observe that by the triangle inequality 


I Q -Q c 


< IIS -e 


\g° - g b 


I Q -Q 


Hence the claim follows from (iii) and part one. Claim (vi) can be established 
as follows. Applying Lemma 9.1, Lemma 9.2 and (iv) we get 


\ e j ~ e< j || L 2 $ ||S r - Q*\\ c H>i ^ P M- 


Due to the convexity assumption in (G2), relation (76) in Lemma 7.13 and 
(G2) yield 


< 


mini <j< T ^j ~ A r 


<T 1 +C 'n b <n t ( 1+r 


'P ■ 


(98) 
Hence for large enough $C_0 > 0$, the claim follows. In order to establish (vii),
observe that by Lemma 9.2, (iii) and proceeding similarly as in (98) we get that 
for large enough t > 0 

IK - £ \\s b - s°LM- < E 

j>T 

<n-( c+_ 1)i+1/2+c ~\j+ <n~ 2 \j+, 

uniformly for 1 < j < Jf . For the second part, we can proceed in the same way. 
Claim (viii) can be established as follows. Note that by Lemma 9.1, Lemma 9.2 
and (iv) 


max |A 7 - — A®1 < \\Q T — Q°\\ r < p b . 

1 <j<r' J 11 

On the other hand, we get from (G2) that At > n~ c l . Hence we conclude that 
for large enough Co > 0 

min A, = A r > 2 max I A,- — A*|. 

1 <j<T J 1 <3<t' j j 1 


Since Aj/A* = Xj — A?)/A* + 1 (and similarly for A*/A j), the claim follows. For 
the second part, using a 2 — b 2 = (a— b)(a + b) and Cauchy-Schwarz, we get that 

| A*-A, | = |E[(X fc , e p 2 ] -E[(X k , ej ) 2 }\ 

— ||(**» e i — e i) II2 II e j + e J) 112 

<||el-e,|| L2 ||||X fe || L2 || 2 ((Ap 1 / 2 + (A J ) 1/2 ). 

An application of Lemma 9.1, Lemma 9.2 and (iv) then yields 

IK-KIEI 0 r -0ll £ M<KM- 

Using (98) we conclude that for Co > 0 sufficiently large 

min A = A r > 2|A?- Ay |, 

1<J<T J i j i 

and thus one readily deduces the claim. 

□ 

We are now ready to actually proceed with (Step 1). 


Lemma 9.4. Grant Assumption 4.4. Then for sufficiently large $\mathfrak{t}$, $C_0 > 0$ we
have
Proof of Lemma 9.4. By construction in (95), we can use Mercer's Theorem to
obtain the desired decomposition. The bound for ^2]= i follows from Lemma 
9.3 (viii) and (G3). 

□ 

The next three lemmas establish the validity of $(G1)^b$, $(G2)^b$, and
$(G3)^b$.

Lemma 9.5. Grant Assumption 4.4. Then for $1 \le q \le p^*$ and sufficiently large
$\mathfrak{t}$, $C_0 > 0$ we have

(l) maxi<^< r ||77^ -rjlj J \\ q < n \ 

(ii) maxi< J -< r ||X;fc = i( r ?fe,j - Vk,j)\\ 2 g < n ~ 1 - 
Hence (Gl) b holds for {rff j}i<i j< T and {Vkj}i<k<n i<j<r due to Lemma 9.3 

(i)- 

Proof of Lemma 9.5. We first show (i). Note that it suffices to uniformly control 
the distance between rff. i rff._ h 3 and ri k i r] k _ h ■. To this end, observe that we have 
the decomposition 


( X k > e °j){Xl-h > off) — (Xf , e* — ej ) (Xf_ h , ef) + (Xf , ej ) (X k _ h , ef — ef) 

+ ( X k> e< j ~ e j){Xk-hi e i ~ e i) + ( X k> e j)( X k-h’ e i)- (99) 

We will deal with the error terms separately. Recall that $\overline{X} = X - E[X]$.
Applying Fubini-Tonelli, we get that


{Xl,e« i -e j )(Xl_ h ,e* i -e i ) = 



Using Cauchy-Schwarz two times, we thus obtain from the above 

b 


E E 

6,-1 k—h -\-1 


(XZ,e3 - ej )(Xj;_ h , e « - ei ) 


n — h 


< 


° n ~Y T ~Y T 

E E 


h=1 k=h +1 


n — h 


L 2 xL 2 


* \\ e i e » ¥2 e i e 7 I 


Il 2 II j v?IIl 2 ’ 


where ||/||l 2 x l 2 = /t 2 f( u i v )dudv. Elementary calculations yield that 


E E 

h—1k=h +1 


x l x l- h 


, — h 


L 2 xL 2 


< J2 AiA tH 


(oo,6,2) ii2 


q i,j =1 


Since Y^JLi A j — A i < °o by (G2), (35) and (G3), we obtain from 

Lemma 9.3 (i) that 


E E 

h= 1 k=h-\-l 


VT VT 

A fc ^k-h 


,-h 


L 2 xL 2 


<-E a E<- 


n 

Q i,j =1 


n 


( 100 ) 
Observe that by Lemma 9.3 (ii) max^giN tpjj < oo. Due to (35) and Lemma
9.3 (viii) we conclude max 1 <j< T Aj/Af < 2 maxi<j< T (fijj < oo. From (G2) 
we thus obtain that maxi<j< T (A*) 1 / 2 < r c / 2 . Due to Lemma 9.3 (vi), and 
piecing all bounds together, we get that for sufficiently large Co > 0 


max (A?Ap 1/2 


E E 

h=l k=h -\-1 


( X L e< j - e j)( X k-h’ e i - e i) 


n — h 


< n- 1 . 


( 101 ) 


Arguing in the same manner, one also obtains 

b • 

'l —1/2 


max (A*A*) 1/2 

l< 2 ?< 7 <r 


E E ■ 

h— 1 /c—/i+l 


e i) (Xf h i ei) 


n 


— h 


( 102 ) 


and the same bound also applies to (X]^,ej}(Xf_ h ,e^ — ef). By virtue of the 
decomposition in (99), the triangle inequality and (101), (102), we conclude 
that 


max 

1<M<T 


Vi 


_(oo ,b) 


< 
q ~ 


-1 


which establishes (i). In order to show (ii), we can proceed in the same way. 
The only significant difference is that one needs to use Lemma 10.3 instead of 
Lemma 9.3 (i). □ 

Lemma 9.6. Grant Assumption 4.4. Then for sufficiently large $\mathfrak{t}$, $C_0 > 0$, condition
$(G2)^b$ holds for $\{\lambda_j^\diamond\}_{j \in \mathbb{N}}$ with the same $a$, $J^+$, uniformly in $n, b$.

Proof of Lemma 9.6. From the triangle inequality, Lemma 9.1, Lemma 9.2 and 
Lemma 9.3 (iv) we get that for 1 < i,j < t. 

|A* — A®| > |Ai — Ay| — 2\\g° - g T \\ c 

= |A j -A i | ~0(p b ). (103) 

Due to the convexity assumption in (G2), relation (76) in Lemma 7.13 and 
(G2) yield that for i > j 


|Ai -Xj\> |Aj+i - Xj\> A j/j > r c ~-y (104) 

Combining (103) and (104), it follows that for large enough Co > 0 

| Af — Af | > | A» — A j | (l — 0(p b )), uniformly for 1 < i, j < r. (105) 
Using (105) and Lemma 9.3 (viii), we thus obtain uniformly for 1 < j < J+ < r 


max 
i <j<Jt 


E 

i—l 


A? 


|A? - At 


<2(1 + 0(p b )) 


max > 




\ 

|Aj — AJ ’ 



















This together with Lemma 7.13 and condition J+ < n 1//2 “(logn) 2 yields that 


(n/ logn) 


-1/2+0 


A° 

max > ,, ^ V < 00 . 

IAJ - A-| 


(106) 


In the same manner, one establishes 


(n/ logn) 


-l+2o 


.E- 


A?A? 


( A l" A i ) 2 


< 00 . 


(107) 


Combining (106), (107) with the fact that Aj+ > n c / 2 by (G2) finishes the 
proof. 

□ 


Lemma 9.7. Grant Assumption 4.4. Then for sufficiently large $\mathfrak{t}$, $C_0 > 0$,
$(G3)^b$ holds for $\varphi_{j,j}^\diamond$, uniformly in $n, b$.

Proof of Lemma 9.7. Arguing similarly as in the proof of Lemma 9.3 (viii), it 
follows that 


E 

|/t|<b 


E[{X k ,e^){X k - h ,e^)\ 


= <?(( A|An 1/2 )- 


Using Lemma 9.3 (viii), we thus conclude from (G3) by routine calculations 
(fijj > <Pj,j/2 + 0 ( 1 ) > l/(4C e ), uniformly for 1 < j < r. 

Similarly, using in addition Lemma 9.3 (ii) yields 

V°j,3 = Vj,j + °(l) ^ E \ K [ r lk,j' r lk-h,j\ | + o(l) < 00 , 

|/ i |<6 


which completes the proof. 

□ 

We are now ready to proceed to (Step 2). To this end, we need the following 
preliminary result. 

Lemma 9.8. Grant Assumption 4.4. Then for $1 \le q \le p^*$ and sufficiently large
$\mathfrak{t}$, $C_0 > 0$, we have

(i) |[4°°’ 6) || ? < \JXiXj (b/n), uniformly for i,j e IN, 

(ii) maxi^^HELi l4j’ 6) - / fcj| 2 || g/2 - « _2(c+_1) S 

(iii) “axi<i<r||Efc=i l / fc,j| 2 || g/2 ~ (V«) + n - (c+ - 1)t . 
Proof of Lemma 9.8. We first show (i). Using representation (32) with b = oo
(a different b), elementary calculations give 


r(°°,i>) I 


;$ v A *M K 


*(°°>&)|| 
i,j Ik ' 


max — 
ije in n z 


k=l 


Observe that by (G3) we get that min ?e K ipjj > 1/C g . Due to (35) we conclude 
maxj g ]N Xj/Xj < C . From (Gl), using Lemma 10.3 and Lemma 9.3 (i), claim 
(i) follows. Next, observe that by the triangle inequality, Cauchy-Schwarz and 
Lemma 9.1, for 1 < j, k < r 

14fe’ 6) - !j,k\ < \({Q b - Q > )ieiUk)\+\({G b - e°)(e 3 ),e fe )| 


I q -g c 


max ||ej — ej || L2 . 


1 <j<T 


Since Yfj=\{ x ’ e i) 2 — IMIl 2 ’ inequality (a + 6 + c) 2 < 3(a 2 + 6 2 + c 2 ), Lemma 
9.3 (v), (vi) and the triangle inequality then yield 


E l r (00 ,b) ,0 


i 2 ii g/ 2 <mia-rmi 


9/2 


\G b -G 0 \\ r +T 2 p 2 


k =1 


Hence selecting Co > 0 sufficiently large, claim (ii) follows from Lemma 9.3 
(iii). Claim (iii) follows from (a + b ) 2 < 2(a 2 + & 2 ) and (i), (ii). 


□ 


We can now complete (Step 2): 


Proof of Theorem 4.6. By virtue of Theorem 4.2 and Lemmas 9.4, 9.5, 9.6 and
9.7, it suffices to show that the error of replacing all quantities in Theorem 4.6
with their $\diamond$-analogues is negligible. More precisely,

" max 7+ ^ - i fr ) \/^w P ~ 11^5+1- x °j - ii P + err ° r - 


i <j<Ji 


(108) 


We will do so in the sequel. Observe first that by Lemma 9.1 and Lemma 9.2 
we have 


|(A$ - A j) - (Xj - A^)| < |A$ - Ap + \ \j - Ap < ||S - g 
Hence by Lemma 9.3 (iii) we get that (recall p* = p2 p+4 ) 


\g b - g° 


\c 


\J n/b\ 


max 


(X b - A ) - (W - Ap|/Aj|| < VWbn-^-^/Xj 


Due to condition (G2) and Jff < n 1 ' 2 , we get 
n/bn~^ 


! x j+ S n 


^tl-(c+-l)t < -1 
for t sufficiently large. We thus conclude


y/njb 


i< n <J+(|t Ajb ~ ^ 


(109) 


An application of Lemma 9.8 (ii) yields that for sufficiently large t > 0 and 

C 0 >0 


VWb\\ max , - -^IMilL ~ n 1 - 

i <j<Ji JJ p 

Combining (109) and (110) and using Lemma 9.3 (viii), we arrive at 


( 110 ) 


II max , - I jj OC) l/ A i IL ~ II max + I A j - A i - T h I A? |L + 1 /{ns/njb) 

i <j<Ji p i <j<jZ p 

< || max |A* - A® — ^ 71 / A TIL + l/{n^/njb). 
i <j<Ji 


□ 

Proof of Theorem 4.7. Proceeding as in the proof of Theorem 4.6, based on
Lemmas 9.5, 9.6 and 9.7, it suffices to show that the error of replacing all
expressions by their corresponding $\diamond$-analogues is bounded by $n^{-2}$, uniformly
for $1 \le j \le J^+$. To this end, note first that due to the convexity assumption in
(G2), Lemma 7.13 yields that uniformly for $k, j \in \mathbb{N}$

$$(j \vee k)|\lambda_j - \lambda_k| \ge (\lambda_j \vee \lambda_k)|j - k|. \tag{111}$$


We will make frequent use of this lower bound in the sequel. We first consider 
the expansion of e* — ej. To this end, we establish preliminary bounds regarding 

Ilj ■ For 2J+ < r, using Lemma 7.12 and the triangle inequality we get 


max 
i <j<Ji 



(fffff 


(V - A,) 2 



WC’E 

k>r 


max 
1 <i<Jt 


1 ll(4“’ l> )\ 

A, (A* - A,) 2 • 


Since 2 p < p*, 2J+ < r, Lemma 9.8 (i), (111) and (G2) yield the upper bound 

(■foA E 


1 k 2 Xj\ k , + ,i/ P bn (c 1)4 1 1 

max — , ' —-y^(d„) --- max —. 

11 1 ' i<j<jf Aj Xj {k j) Xj+ i<j<j+ Ay 


k>r 


Since Ay > Ay/Ay_i > Ay A 1 and Ay > j c by (G2) and J+ < n 1 / 2 J we 
conclude from the above that for sufficiently large t > 0 we have 


max 
i <o<Ji 


1 

a; 


E 

k>r 


(faff 


(A k Ay ) 2 


< 


2.7+ < r. 


p 


( 112 ) 
Arguing in a similar manner, we get that for sufficiently large t > 0


max 
1 <j<Jn 

-2(c+ —l)t+2i 


5, n 


feTi ^ ~ Ak > 

k^j 

+1/P <n~ 2 . 


((J+) 2/p 

1 W 


EI J £ ,b) 


k=l 

k^j 



(113) 


By related arguments and Lemma 9.3 (vi), Lemma 9.8 (iii) , we get that for 
sufficiently large Co > 0 


1 

{ e k e k)Ik,j 



1 <3<Ji A j 

h ~ Xk 

L 2 

P ~i <j<k\ A Jn+ Ai h\ k ~h 


k^j kytj 


(114) 


Combining (112), (113) and (114) and using 11ej11 ^2 = 1 we obtain via the 
triangle inequality 


1 

max — 

i <i<Jt A j 


E efc 


fc=l 


k^j 


r(oo ,b) 

1 kj 



E 

fc=i 

k^j 


T° 

o 1 k,j 
e k 


Xj Afc 


L 2 


< 1 . 

~ rc 2 


(115) 


Now, using (a + b) 2 < 2(a 2 + 6 2 ), ||e*|| L 2, ||e?|| L 2, ||e||| L 2, || ej\\^ = 1, Lemma 9.3 
(vi), (vii) and (115), the triangle inequality gives for sufficiently large t, Cq > 0 


max 



■J ii 


2 

L 2 ~ 


E 

k=l 

k^j 


&k 


.( 00 , 6 ) 

2 k,j 

Xj Xk 


L 2 


V 


i 

pf*. 

£<> 

1 ?* P 0 || 2 

T fO 

r o 


l<J<Jn A 1 

■? J 2 

1 E/11 

tr x ^ Xk 

L 2 


k =1 
k^j 


(116) 


Moreover, using the same arguments as in the proof of Lemma 9.6, it follows 
that 


A , = E 


Xj Xk 


k =1 
k^j 


{Xj - x k y 


^E 

k =1 
k^j 


Xj Xk 


{Xj - Afc) 2 


> 


(1 - o{p b )) 


E 

fc=i 

Ml 


A® A? 


(x°-xty 


d = A 0 


(117) 

and this holds uniformly for 1 < j < J+ (we exclude 0{p b ) in the above 
definition of A*). Similarly, using also Lemma 9.8 (iii) in addition, it follows 
that 


max 

l<7<Af 


1 

A 1 




<p\ 


P 


(118) 
for sufficiently large $C_0 > 0$. Using first (118) and then (117), it follows that
(116) is further bounded by 



1 


max 
1 <o<J 





1113-31 


-£■ 
k =1 
k^j 


T° 

J k,j 






(119) 


This completes the proof for the expansion of ej — e-p The treatment of the 
expansion ||e* — e 7 ||^ 2 only requires minor adaption of the previous arguments, 
we omit the details. 

□ 


Proof of Proposition 4.8. This follows from Lemma 9.3 (i) and analogous computations
as in the proof of Theorem 4.2.

□ 


Proof of Corollary 4.9. This follows from Proposition 4.8 and Lemma 9.3 (i).

□ 


10. Proofs of Section 5 

We need to introduce some further notation. To this end, we slightly reformulate
our notion of weak dependence in an equivalent way. In the sequel, $\{\epsilon_k\}_{k \in \mathbb{Z}} \in \mathbb{S}$
denotes an IID sequence in some measure space $\mathbb{S}$ and $\mathcal{F}_k = \sigma(\epsilon_j, j \le k)$ the
corresponding filtration. For $d \in \mathbb{N}$, we then consider the variables

$$U_{k,h} = H_h(\mathcal{F}_k), \quad k \in \mathbb{Z},\ 1 \le h \le d,$$

where $H_h$ are measurable functions. Note that by considering different measure
spaces $\mathbb{S}$, we can virtually model any spatial dependence structure we want,
with the extreme cases where $U_{k,h} = U_{k,h+1}$ or $U_{k,h}$ and $U_{k,h+1}$ are independent.
Compared to Section 5, this setup is notationally more convenient, and
spares us the necessity of considering different sequences $\{\epsilon_{k,h}\}_{k \in \mathbb{Z}}$ for
each coordinate $h$. As a measure of dependence, we then consider

$$\theta_{j,p} = \max_{1 \le h \le d} \|U_{j,h} - U_{j,h}'\|_p, \quad p \ge 1,$$

where $U_{k,h}' = H_h(\mathcal{F}_k')$, $\mathcal{F}_k' = \sigma(\ldots, \epsilon_{-1}, \epsilon_0', \epsilon_1, \ldots, \epsilon_k)$, and $\{\epsilon_k'\}_{k \in \mathbb{Z}}$ is an
independent copy of $\{\epsilon_k\}_{k \in \mathbb{Z}}$.

10.1. Gaussian approximation for weak dependence 

In this section, a high dimensional Gaussian approximation result is established,
which is a key ingredient in the proof of Theorem 5.2. This result may be of
independent interest. Let $S_{n,h} = \sum_{k=1}^{n} U_{k,h}$, and denote with

$$T_d = \frac{1}{\sqrt{n}} \max_{1 \le h \le d} |S_{n,h}|, \quad T_d^Z = \max_{1 \le h \le d} |Z_h|, \tag{120}$$

where $\{Z_h\}_{1 \le h \le d}$ is a sequence of zero mean Gaussian random variables. We
also formally introduce

$$\gamma_{i,j} = \lim_{n \to \infty} \frac{1}{n} E[S_{n,i} S_{n,j}];$$

existence is shown below in Lemma 10.6. We also put $\sigma_h^2 = \gamma_{h,h}$. Throughout
this section, we work under the following assumption.

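Assumption 10.1. The sequence $\{U_{k,h}\}_{k \in \mathbb{Z}}$ is stationary for each $1 \le h \le d$,
such that for $p > 2$ and $d \lesssim n^{\mathfrak{d}}$

(F1) $E[U_{k,h}] = 0$ and $\theta_{j,p} \lesssim j^{-\mathfrak{c}}$ with $\mathfrak{c} > 3/2$,

(F2) $\mathfrak{d} < p/2 - 1$,

(F3) $\inf_h \sigma_h > 0$.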

We then have the following Gaussian approximation result. 

Theorem 10.2. Grant Assumption 10.1. Then

$$\sup_{x \in \mathbb{R}} \big| P(T_d \le x) - P(T_d^Z \le x) \big| \lesssim n^{-C}, \quad C > 0,$$

where $\{Z_h\}_{1 \le h \le d}$ has the same covariance structure as $n^{-1/2}\{S_{n,h}\}_{1 \le h \le d}$.
Alternatively, we may also choose $(\gamma_{i,j})_{1 \le i,j \le d}$ as covariance structure.
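To make the statistic in Theorem 10.2 concrete, the following sketch computes $T_d = n^{-1/2}\max_{1 \le h \le d} |S_{n,h}|$ from an $n \times d$ array of observations. The function name `max_statistic` and the toy data are assumptions for illustration; the sketch does not attempt the Gaussian comparison itself, which would require the covariance structure $(\gamma_{i,j})$.

```python
import math


def max_statistic(u):
    """T_d = n^(-1/2) * max_h |S_{n,h}| for data u[k][h], k < n, h < d."""
    n = len(u)
    d = len(u[0])
    # Column sums S_{n,h} = sum_k U_{k,h}
    sums = [sum(row[h] for row in u) for h in range(d)]
    return max(abs(s) for s in sums) / math.sqrt(n)


# Toy data with n = 4 observations and d = 2 coordinates.
u = [[1.0, -2.0], [1.0, -2.0], [1.0, 0.0], [1.0, 0.0]]
t_d = max_statistic(u)
print(t_d)
```

Here both column sums have absolute value 4, so the statistic equals $4/\sqrt{4} = 2$.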

We first establish some additional notation. Let K = n { , L = n l such that 
n = KL and 0 < 6 , l < 1. To simplify the discussion, we always assume that 
K, L £ IN. For each 1 < l < L, let {ei} fc6Z £ S be mutually independent 
sequences of IID random variables. For K{1 — 1) < k < Kl , 1 < l < L, denote 
with 

U ( k K n 0) = where Pff = e K{l _ 1)+1 , e X (i-i)+ 2 , ■•■,£/=), 

where T l k = cr (e^- j < k). For 1 < m < K put 

$$V^{\circ}_{l,h}(m) = \sum_{k=K(l-1)+1}^{K(l-1)+m-1} U_{k,h} + \sum_{k=K(l-1)+m}^{Kl} U_{k,h}^{(K,\circ)} \qquad (121)$$

and $V^{\circ}_{l,h} = V^{\circ}_{l,h}(1)$. The random variables $V^{\circ}_{l,h}$ play a key role in the proof of Theorem 10.2. Note in particular that $\{V^{\circ}_{l,h}\}_{1 \le l \le L}$ is IID by construction for each $h$. Finally, put $S_{L,h}(V) = \sum_{l=1}^{L} V_{l,h}$ and $S^{\circ}_{L,h}(V) = \sum_{l=1}^{L} V^{\circ}_{l,h}$ and note that $S_{n,h} = S_{L,h}(V)$. In the sequel, we make frequent use of the following lemma.
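The blocking identity $S_{n,h} = S_{L,h}(V)$ can be illustrated with a short sketch. It assumes that $V_{l,h}$ denotes the sum of $U_{k,h}$ over the $l$-th block of length $K$, which is consistent with the identity above; the function name `block_sums` is hypothetical.

```python
def block_sums(u, K):
    """Split a sequence of length n = K * L into L consecutive blocks of
    length K and return the list of block sums (V_1, ..., V_L)."""
    n = len(u)
    if K <= 0 or n % K != 0:
        raise ValueError("the block length K must divide the sample size n")
    L = n // K
    return [sum(u[l * K:(l + 1) * K]) for l in range(L)]


if __name__ == "__main__":
    u = list(range(1, 13))          # a toy sample with n = 12
    v = block_sums(u, K=3)          # L = 4 blocks
    print(v, sum(v) == sum(u))      # the block sums add up to the full sum
```

In the proof the blocks are additionally decoupled: each $V^{\circ}_{l,h}$ is built from its own independent past, which is what makes the block sums IID while keeping them close to the original $V_{l,h}$.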

Lemma 10.3. Suppose that $\Theta_{j,p} < \infty$ for $p \ge 2$. Then

$$\max_{1 \le h \le d} \big\| U_{1,h} + \cdots + U_{n,h} \big\|_p \lesssim \sqrt{n}.$$

For the proof and variants of this result, see [63]. The next lemma controls the approximation error between $S_{L,h}(V)$ and $S^{\circ}_{L,h}(V)$.

Lemma 10.4. Grant Assumption 10.1. For any $K = n^{\mathfrak{k}}$ with $0 < \mathfrak{k} < 1$ there exists a $\delta > 0$ and a constant $C > 0$ such that

$$P\big( |S_{L,h}(V) - S^{\circ}_{L,h}(V)| \ge C n^{1/2 - \delta} \big) \lesssim n^{-\frac{p-2}{2} + p\delta}.$$

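Proof of Lemma 10.4. Let $x_n = x\sqrt{n}$, $x > 0$. For $1 \le m \le K$ we have that

$$P\big( |S_{L,h}(V) - S^{\circ}_{L,h}(V)| \ge 2x_n \big) \le P\Big( \Big| \sum_{l=1}^{L} \sum_{k=K(l-1)+1}^{K(l-1)+m-1} \big( U_{k,h} - U_{k,h}^{(K,\circ)} \big) \Big| \ge x_n \Big) + P\Big( \Big| \sum_{l=1}^{L} \big( V_{l,h} - V^{\circ}_{l,h}(m) \big) \Big| \ge x_n \Big).$$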

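Denote with $\alpha_{j,p} = \big( j^{p/2-1} \theta_{j,p}^{p} \big)^{1/(p+1)}$, and $A = \sum_{j=1}^{\infty} \alpha_{j,p}$. Note that by (F1) we have

$$\alpha_{j,p} \lesssim j^{-\mathfrak{B}(p,c)}, \quad \text{where} \quad \mathfrak{B}(p,c) = \frac{p(c - 1/2) + 1}{p+1} > 1, \qquad (122)$$

and thus $A < \infty$. Due to Theorem 2 in [50], there exist constants $C_{p,1}, C_{p,2} > 0$ such that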


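$$P\Big( \Big| \sum_{l=1}^{L} \sum_{k=K(l-1)+1}^{K(l-1)+m-1} \big( U_{k,h} - U_{k,h}^{(K,\circ)} \big) \Big| \ge x_n \Big) \le \frac{C_{p,1} L m}{x_n^{p}} + \exp\Big( - \frac{C_{p,2}\, x_n^2}{L m \|U_{k,h}\|_2^2} \Big) + \sum_{j=1}^{\infty} \exp\Big( - \frac{C_{p,2}\, \alpha_{j,p}^2 x_n^2}{A^2 L m\, \theta_{j,2}^2} \Big).$$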

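Setting $x = y \sqrt{Lm}\, A^{1+1/p} / \sqrt{n}$, it follows that $\alpha_{j,p}^2 x_n^2 / (A^2 L m\, \theta_{j,2}^2) \ge j^{1-2/p} y^2$ and hence

$$\sum_{j=1}^{\infty} \exp\Big( - \frac{C_{p,2}\, \alpha_{j,p}^2 x_n^2}{A^2 L m\, \theta_{j,2}^2} \Big) \lesssim \exp\big( - C_{p,2}\, y^2 \big).$$

Choosing $m$ such that $\sqrt{n}/\sqrt{Lm} = n^{2\delta}$ and $y = n^{\delta}$, $\delta > 0$, it follows that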


$$P\Big( \Big| \sum_{l=1}^{L} \sum_{k=K(l-1)+1}^{K(l-1)+m-1} \big( U_{k,h} - U_{k,h}^{(K,\circ)} \big) \Big| \ge n^{1/2-\delta} A^{1+1/p} \Big) \lesssim n^{-\frac{p-2}{2} + p\delta}. \qquad (123)$$

Next, put $\Delta_{k,h}(U) = U_{k,h} - U_{k,h}^{(K,\circ)}$. By the triangle inequality, we have
Let $(k)_K = k \bmod K$. Then Theorem 1 in [62] yields that

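$$\max_{1 \le h \le d} \|\Delta_{k,h}(U)\|_p \lesssim \sum_{j=(k)_K}^{\infty} \theta_{j,p} = \Theta_{(k)_K,p}.$$

Since clearly $\Theta_{(k)_K,p}$ is monotone decreasing, we have $\Theta_{(k)_K,p} \le \Theta_{(m)_K,p}$ for $m \le k \le K$. Combining this with the above, it follows that for $m \le (k)_K$ (since $m = (m)_K$)

$$\max_{1 \le h \le d} \big\| \Delta_{k,h}(U) - \Delta_{k,h}(U)' \big\|_p \le 2 \big( \theta_{k,p} \wedge \Theta_{m,p} \big) =: \theta_{k,p}(m). \qquad (124)$$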


Put $\beta_{j,p}(m) = \big( j^{p/2-1} \theta_{j,p}^{p}(m) \big)^{1/(p+1)}$ and $B(m) = \sum_{j=1}^{\infty} \beta_{j,p}(m)$, where $\theta_{j,p}(m)$ is as in (124). Then another application of Theorem 2 in [50] yields that


P 


i=i 


- x n ) < Ci,p^r +E exp ( “ 

1 Xn ' 


3 =1 


Cp,2Pl p {jn)x 2 n 

P 2 (m)m? 2 2 ( m ) 


+ exp - 


Cp,2X n 


nm&x k > m \\Ak,h(U)\\U ’ 

Let $y_n = n^{\delta} \sqrt{Lm}/\sqrt{n} = n^{-\delta}$. Arguing similarly as before, it follows (since $m = (m)_K$)


P 


E - Kh( m ) >Xn)<—+Y^ eX P( - 

X n j=l 


1=1 


C 9 7 1 + _2 / p l/ 2 
^p,2J _i/n 

B(m) 2 


+ exp - 


Cpgvl 

0m.» 


Since 0 m?p < nn 2c+1 , we conclude 


M 


B(m) < ^ ctj tP + 5^(j p ' /2_1 m _pc+p,/2 ) 1/Cl)+1) < + 

j>M .7=1 

Setting $m \sim n^{\nu}$, $\nu > 0$, balancing the above and choosing $\delta$ sufficiently small, we obtain


Vn 


B(r 


A 




n . 


(125) 


'm,p 


This implies that 


P 


ex*- 


;=i 


> n 


1/2-5^1+1/jA < 




Note that by the above choice of $m = n^{\nu}$ we require that $L \sim n^{1 - 4\delta - \nu}$. Choosing $\nu$ sufficiently close to 1, we can select $\mathfrak{k} < 1$ arbitrarily close to 1, which completes the proof.

□ 
In the sequel, we also require the following result.

Lemma 10.5. Grant Assumption 10.1. Then

$$P\big( |V^{\circ}_{l,h}| \ge \sqrt{K} \log n \big) \lesssim K^{1 - p/2} (\log n)^{p}.$$


Proof of Lemma 10.5. Since V° h = Theorem 2 in [50] and arguing similarly 
as in Lemma 10.4 yields 


P 




C P ,2j 1+ ~ 2/p y 2 \ 

A 2 J 


Setting y = log n, the claim follows. 

□ 

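Next, we establish some useful results concerning the covariances $\phi_{k,i,j} = \mathbb{E}[U_{0,i} U_{k,j}]$.

Lemma 10.6. Grant Assumption 10.1. Then

(i) $\sup_{i,j} |\phi_{k,i,j}| \lesssim k^{-c+1/2}$,

(ii) $\sup_{i,j} \sum_{k=0}^{\infty} |\phi_{k,i,j}| < \infty$,

(iii) $\gamma_{i,j} = \phi_{0,i,j} + 2 \sum_{k=1}^{\infty} \phi_{k,i,j} < \infty$,

(iv) $\sum_{k,l=1}^{n} \mathbb{E}[U_{k,i} U_{l,j}] = n \gamma_{i,j} - \sum_{k \in \mathbb{Z}} (n \wedge |k|)\, \phi_{k,i,j}$.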

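Proof of Lemma 10.6. Claims (iii) and (iv) are well-known in the literature, and follow from elementary computations from (ii). Since (i) implies (ii) due to $c > 3/2$, it suffices to establish (i). To this end, let $U^{*}_{k,h} = H_h(\mathcal{F}^{*}_k)$, where $\mathcal{F}^{*}_k = \sigma(\ldots, \epsilon'_{-1}, \epsilon'_0, \epsilon_1, \ldots, \epsilon_k)$. Since then $\mathbb{E}[U^{*}_{k,h} \mid \mathcal{F}_0] = \mathbb{E}[U_{k,h}] = 0$, Cauchy-Schwarz and Jensen's inequality yield

$$\big| \mathbb{E}[U_{0,i} U_{k,j}] \big| = \big| \mathbb{E}\big[ U_{0,i}\, \mathbb{E}[U_{k,j} \mid \mathcal{F}_0] \big] \big| \le \|U_{0,i}\|_2 \, \|U_{k,j} - U^{*}_{k,j}\|_2.$$

Theorem 1 in [62] and (F1) then imply that

$$\big| \mathbb{E}[U_{0,i} U_{k,j}] \big| \lesssim \sqrt{\sum_{l \ge k} \theta_{l,2}^2} \lesssim k^{-c+1/2}.$$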


□ 
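The identity in Lemma 10.6 (iv) can be checked numerically in a simple scalar case. The sketch below assumes, purely for illustration, an AR(1) autocovariance $\phi_k = a^{|k|}/(1-a^2)$ (unit innovation variance), for which the long-run variance is $\gamma = 1/(1-a)^2$; this model and the function names are not part of the paper.

```python
def phi(k, a):
    """Autocovariance at lag k of an AR(1) process with coefficient a
    and unit innovation variance (an illustrative assumption)."""
    return a ** abs(k) / (1.0 - a * a)


def double_sum(n, a):
    """Left-hand side of (iv): sum over k, l = 1..n of E[U_k U_l]."""
    return sum(phi(k - l, a) for k in range(1, n + 1) for l in range(1, n + 1))


def via_identity(n, a, cutoff=200):
    """Right-hand side of (iv): n * gamma - sum over k in Z of min(n, |k|) * phi_k.
    The infinite sum is truncated at |k| <= cutoff; the tail is geometrically small."""
    gamma = 1.0 / (1.0 - a) ** 2
    correction = sum(min(n, abs(k)) * phi(k, a) for k in range(-cutoff, cutoff + 1))
    return n * gamma - correction


if __name__ == "__main__":
    print(double_sum(50, 0.5), via_identity(50, 0.5))
```

Both sides agree because $\sum_{k,l=1}^{n} \phi_{k-l} = \sum_{|k|<n} (n - |k|)\phi_k$, and subtracting this from $n\gamma = n\sum_{k \in \mathbb{Z}} \phi_k$ leaves exactly $\sum_{k \in \mathbb{Z}} (n \wedge |k|)\phi_k$.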


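For $1 \le i, j \le d$ denote with

$$\gamma^{(n)}_{i,j} = \frac{1}{n} \mathbb{E}[S_{n,i} S_{n,j}], \qquad \gamma^{(\circ,n)}_{i,j} = \frac{1}{n} \mathbb{E}\big[ S^{\circ}_{L,i}(V)\, S^{\circ}_{L,j}(V) \big].$$

Remark 10.7. Note that Lemma 10.6 (iv) yields that

$$\big| \gamma_{i,j} - \gamma^{(n)}_{i,j} \big| \lesssim \frac{1}{n} \sum_{k=1}^{n} k^{3/2-c} + \sum_{k > n} k^{-c+1/2} \lesssim n^{3/2-c}.$$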

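Lemma 10.8. Grant Assumption 10.1. Then

$$\max_{1 \le i,j \le d} \big| \gamma^{(n)}_{i,j} - \gamma^{(\circ,n)}_{i,j} \big| \lesssim n^{-1/2} L.$$

Remark 10.9. Note that we obtain from Remark 10.7 that

$$\big| \gamma_{i,j} - \gamma^{(\circ,n)}_{i,j} \big| \lesssim n^{-1/2} L + n^{3/2-c}.$$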

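Proof of Lemma 10.8. We have that

$$\big| \mathbb{E}[S_{L,i}(V) S_{L,j}(V)] - \mathbb{E}[S^{\circ}_{L,i}(V) S^{\circ}_{L,j}(V)] \big| \le \sum_{l=1}^{L} \|V_{l,i} - V^{\circ}_{l,i}\|_2 \, \|S_{L,j}(V)\|_2 + \sum_{l=1}^{L} \|V_{l,j} - V^{\circ}_{l,j}\|_2 \, \|S^{\circ}_{L,i}(V)\|_2.$$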

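By the Marcinkiewicz-Zygmund inequality, Lemma 10.3 and (F1) we have

$$\max_{1 \le h \le d} \|S^{\circ}_{L,h}(V)\|_2 \lesssim \sqrt{n} \quad \text{and} \quad \max_{1 \le h \le d} \|S_{L,h}(V)\|_2 \lesssim \sqrt{n}. \qquad (126)$$

Using the triangle inequality and Theorem 1 in [62], it follows that

$$\max_{1 \le h \le d} \sum_{l=1}^{L} \|V_{l,h} - V^{\circ}_{l,h}\|_2 \le \max_{1 \le h \le d} L \sum_{k=1}^{K} \|U_{k,h} - U^{*}_{k,h}\|_2 \lesssim L \sum_{k=1}^{K} \sqrt{\sum_{j \ge k} \theta_{j,2}^2} \lesssim L \sum_{k=1}^{\infty} k^{-c+1/2} \lesssim L. \qquad (127)$$

Hence combining (126) and (127) we obtain

$$\max_{1 \le i,j \le d} \big| \gamma^{(n)}_{i,j} - \gamma^{(\circ,n)}_{i,j} \big| \lesssim n^{-1/2} L.$$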




□ 


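Next, we state some Gaussian approximation results. To this end, we require the following condition. For $\varepsilon, u(\varepsilon) > 0$ we have

$$P\Big( \max_{1 \le h \le d} \max_{1 \le l \le L} |V^{\circ}_{l,h}| \ge \sqrt{K u(\varepsilon)} \Big) \le \varepsilon. \qquad (128)$$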
Denote with

$$T^{\circ}_{L,d} = \frac{1}{\sqrt{n}} \max_{1 \le h \le d} |S^{\circ}_{L,h}(V)|, \qquad T^{Z,\circ}_{d} = \max_{1 \le h \le d} |Z^{\circ}_h|,$$

where $\{Z^{\circ}_h\}_{1 \le h \le d}$ is a zero mean Gaussian sequence with covariance structure $\Gamma^{(\circ,n)} = \big( \gamma^{(\circ,n)}_{i,j} \big)_{1 \le i,j \le d}$. We have the following Gaussian approximation result, which is an adaptation of Theorem 2.2 in [19].

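Lemma 10.10. Assume the validity of (128) and that

(i) $K^{-1/2} \min_{1 \le h \le d} \min_{1 \le l \le L} \|V^{\circ}_{l,h}\|_2 > 0$,

(ii) $K^{-1/2} \max_{1 \le h \le d} \max_{1 \le l \le L} \|V^{\circ}_{l,h}\|_4 < \infty$.

Then it holds that

$$\sup_{x \in \mathbb{R}} \big| P(T^{\circ}_{L,d} \le x) - P(T^{Z,\circ}_{d} \le x) \big| \lesssim L^{-1/8} \big( \log(dL/\varepsilon) \big)^{7/8} + L^{-1/2} \big( \log(dL/\varepsilon) \big)^{3/2} u(\varepsilon) + \varepsilon.$$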

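We also require the following two results, which are Lemmas 2.1 and 3.1 in [19], slightly adapted for our purpose.

Lemma 10.11. Let $\{X_h\}_{1 \le h \le d}$ and $\{Y_h\}_{1 \le h \le d}$ be zero mean Gaussian sequences, and denote with $\gamma^{X}_{i,j}$, $\gamma^{Y}_{i,j}$ the corresponding covariances for $1 \le i,j \le d$. If $0 < \inf_h \gamma^{Y}_{h,h} \le \sup_h \gamma^{Y}_{h,h} < \infty$, then

$$\sup_{x \in \mathbb{R}} \Big| P\Big( \max_{1 \le h \le d} |X_h| \le x \Big) - P\Big( \max_{1 \le h \le d} |Y_h| \le x \Big) \Big| \lesssim \delta^{1/3} \big( 1 \vee \log(d/\delta) \big)^{2/3},$$

where $\delta = \max_{1 \le i,j \le d} |\gamma^{X}_{i,j} - \gamma^{Y}_{i,j}|$.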

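Lemma 10.12. Let $\{X_h\}_{1 \le h \le d}$ be a zero mean Gaussian sequence, and denote with $\gamma_{i,j}$ the corresponding covariances for $1 \le i,j \le d$. If $0 < \inf_h \gamma_{h,h} \le \sup_h \gamma_{h,h} < \infty$, then

$$\sup_{x \in \mathbb{R}} P\Big( \Big| \max_{1 \le h \le d} |X_h| - x \Big| \le \delta \Big) \lesssim \delta \sqrt{1 \vee \log(d/\delta)}.$$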

We are now ready to give the proof of Theorem 10.2. 

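Proof of Theorem 10.2. First note that by Lemma 10.4 and Boole's inequality we have

$$P\Big( \max_{1 \le h \le d} |S_{L,h}(V) - S^{\circ}_{L,h}(V)| \ge C_1 n^{1/2-\delta} \Big) \lesssim d\, n^{-\frac{p-2}{2} + p\delta}.$$

Since $d \lesssim n^{\mathfrak{d}}$ we obtain from (F2) that

$$P\Big( \max_{1 \le h \le d} |S_{L,h}(V) - S^{\circ}_{L,h}(V)| \ge C_1 n^{1/2-\delta} \Big) \lesssim n^{-C_2}, \quad C_2 > 0. \qquad (129)$$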
Employing this bound, we get that

$$P(T_d \le x) \le P\big( T^{\circ}_{L,d} \le x + C_1 n^{-\delta} \big) + O(n^{-C_2}).$$

In the same manner one obtains a lower bound, hence

$$P\big( T^{\circ}_{L,d} \le x - C_1 n^{-\delta} \big) - O(n^{-C_2}) \le P(T_d \le x) \le P\big( T^{\circ}_{L,d} \le x + C_1 n^{-\delta} \big) + O(n^{-C_2}). \qquad (130)$$
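The passage from (129) to (130) is the usual "good event" argument; a short sketch is given here for convenience. The name $A_n$ for the good event is introduced only for this illustration and does not appear in the source.

```latex
% A_n is a hypothetical name for the event controlled in (129).
A_n = \Big\{ \max_{1 \le h \le d} |S_{L,h}(V) - S^{\circ}_{L,h}(V)| \le C_1 n^{1/2-\delta} \Big\},
\qquad P(A_n^c) \lesssim n^{-C_2}.

% Reverse triangle inequality for maxima, using S_{n,h} = S_{L,h}(V):
|T_d - T^{\circ}_{L,d}|
  \le \frac{1}{\sqrt{n}} \max_{1 \le h \le d} |S_{L,h}(V) - S^{\circ}_{L,h}(V)|
  \le C_1 n^{-\delta} \quad \text{on } A_n.

% Upper bound: \{T_d \le x\} \cap A_n \subseteq \{T^{\circ}_{L,d} \le x + C_1 n^{-\delta}\}, so
P(T_d \le x) \le P\big(T^{\circ}_{L,d} \le x + C_1 n^{-\delta}\big) + P(A_n^c).

% Lower bound, by the symmetric inclusion:
P\big(T^{\circ}_{L,d} \le x - C_1 n^{-\delta}\big) \le P(T_d \le x) + P(A_n^c).
```

The same two inclusions are reused later in the proof whenever a statistic is replaced by a close approximation at the cost of a shift in $x$ and a small exceptional probability.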

Next, we apply Lemma 10.10 to $T^{\circ}_{L,d}$. To this end, we need to verify its conditions. Note that by the independence of $V^{\circ}_{l,h}$, we have that

$$\gamma^{(\circ,n)}_{h,h} = \frac{1}{LK} \sum_{l=1}^{L} \|V^{\circ}_{l,h}\|_2^2 = \frac{1}{K} \|V^{\circ}_{1,h}\|_2^2.$$

Hence we deduce from Lemma 10.6, Lemma 10.8, Remark 10.9 and (F3) that

$$K^{-1} \|V^{\circ}_{l,h}\|_2^2 \ge \gamma_{h,h} - o(1) = \sigma_h^2 - o(1) > 0,$$


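uniformly in $h$, and thus (i) holds. Next we verify (ii). This, however, readily follows from Lemma 7.12 and (F1). Finally, we need to establish (128). Set $u(\varepsilon) = (\log n)^2$. Using Boole's inequality and Lemma 10.5 gives

$$P\Big( \max_{1 \le h \le d} \max_{1 \le l \le L} |V^{\circ}_{l,h}| \ge \sqrt{K u(\varepsilon)} \Big) \le \sum_{h=1}^{d} \sum_{l=1}^{L} P\Big( |V^{\circ}_{l,h}| \ge \sqrt{K u(\varepsilon)} \Big) \lesssim d L K^{-\frac{p-2}{2}} (\log n)^{p}.$$

By (F2) and choosing $\mathfrak{k}$ sufficiently close to 1, we get that

$$P\Big( \max_{1 \le h \le d} \max_{1 \le l \le L} |V^{\circ}_{l,h}| \ge \sqrt{K u(\varepsilon)} \Big) \lesssim n^{-C_3}, \quad C_3 > 0,$$

and (128) holds with $\varepsilon \sim n^{-C_3}$. Since $L \sim n^{\mathfrak{l}}$ with $\mathfrak{l} > 0$ due to $\mathfrak{k} < 1$, Lemma 10.10 yields that

$$\sup_{x \in \mathbb{R}} \big| P(T^{\circ}_{L,d} \le x) - P(T^{Z,\circ}_{d} \le x) \big| \lesssim n^{-C_4}, \quad C_4 > 0. \qquad (131)$$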

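Combining this with (130), we deduce that

$$P\big( T^{Z,\circ}_{d} \le x - C_1 n^{-\delta} \big) - O(n^{-C_5}) \le P(T_d \le x) \le P\big( T^{Z,\circ}_{d} \le x + C_1 n^{-\delta} \big) + O(n^{-C_5}). \qquad (132)$$

Next, since $\log d \lesssim \log n$, Lemma 10.12 yields that

$$\sup_{x \in \mathbb{R}} \big| P\big( T^{Z,\circ}_{d} \le x - C_1 n^{-\delta} \big) - P\big( T^{Z,\circ}_{d} \le x \big) \big| \lesssim n^{-\delta} \sqrt{\log n}. \qquad (133)$$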
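In addition, by Remark 10.9

$$\max_{1 \le i,j \le d} \big| \gamma^{(\circ,n)}_{i,j} - \gamma_{i,j} \big| \lesssim n^{-1/2} L + n^{3/2-c} \lesssim n^{-C_6}, \quad C_6 > 0.$$

Hence an application of Lemma 10.11 yields

$$\sup_{x \in \mathbb{R}} \big| P(T^{Z,\circ}_{d} \le x) - P(T^{Z}_{d} \le x) \big| \lesssim n^{-C}, \quad C > 0. \qquad (134)$$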

□ 


10.2. Proofs of Section 5 


Proof of Theorem 5.2. Denote with 


T J+ 


1 

—= max 

V n i <j<Ji 


E 


n 

k= 1 



a 0,j 



We first show that we may apply Theorem 10.2 to Tj + . To this end, we need 
to verify Assumption 10.1. Observe that (E2) implies |||| < oo (cf. [62]). 
Moreover, using $a^2 - b^2 = (a - b)(a + b)$, it follows from Cauchy-Schwarz

| \vl,j - ( vl,jY\\ q < 2 || Vk,j - VkJUqhkdLq ~ n k{ 2 q) <k~ b . 

Since b > 3/2 by (E2), (FI) follows. Next, note that (El) implies that J+ < 
n p( a ~< 5). Since q/2 — 1 > p2 t>+2 > pa (recall 0 < a < 1), (F2) holds. Finally, 
(E3) gives (F3), hence Assumption 10.1 is verified. We proceed with the proof. 
For j £ IN, denote with I*j = Xj Efc=i (vl j ~ l) / n ; and note that by the above 
and Lemma 10.3 we have 

II 7 ;J p ;$(Vn 1/2 )> J'elN. ( 135 ) 

Introduce the set 

M = { max X^lX-Xj - I* ■ | > „-V2-«/2\ 

3 1 3,31 1 


Then Markov's inequality together with Proposition 2.7 and (135) yields

$$P(\mathcal{M}^c) \lesssim n^{-p\delta/2} \lesssim n^{-C_1}, \quad C_1 > 0.$$

Due to Theorem 10.2 and the above, we have the inequalities 
P(Tj+ <x)< P(Tj + <x + n~ & / 2 ) + P(M C ) 

<P(Tf+ <x + n~ 6/2 )+0(n~ C2 ), C 2 > 0, 


( 136 ) 
where $T^{Z}_{J_n^+}$ is as in (41). An application of Lemma 10.12 yields that this is further bounded by

P(Tj+ < x) < P(Tj + < x) + 0(n~ C2 + n~ s/2 logra). 

In the same manner, we obtain a lower bound, hence 

sup|P(T + <x)~ P(tY < x) I < n~ c \ C 3 > 0, (137) 

l£R Jn Jn 

which completes the proof. 

□ 

Proof of Corollary 5.3. Due to Theorem 5.2, it suffices to show that

$$P\big( T^{Z}_{J_n^+} \le u_{J_n^+}(z) \big) \to \exp(-e^{-z}).$$

This, however, follows from Theorem 14 and Theorem 1 in [33].

□ 


11. Proofs of Section 6 


Proof of Proposition 6.1. Due to (46), Theorem 3.6 in [11] yields the Bernoulli- 
shift representation Xk = YtLo & z ( e k-i)- Next, using the orthogonality of 
we get 


IK^>ll* = £^,e?> 2 . (138) 

i=i 

On the other hand, since e*, and Xk-i are independent, we obtain 

A? = ||(AT fc ,e?)|| \ = ||<*(Jf fc _ 1 ),c?)]|; + \\(ek,e<!)\\l > ||<e fc ,e?>||*. (139) 

For k > 1, using the triangle inequality, the linearity of 3?, the fact that 3>(e^) = 
and (48) yields that 


Af|| 4,i - (vM < (£(Af) fc |(ef,ef)|||(e 0 - e' 0 ,ef)\ 


< 


< 


/ 0 \|| 9'/9 


£(Af) fc Kef,ef)|||(eo-e',ef) 


£(Af) fc (£A*.||e 0 , i || ^ lq ^A)\4A? 
*=i S=i 


1 /2\ 2 


where we also used (YJLi ^j( e j’ e t ) 2 ) ^ < oo in the last step (recall 

q' > q). Note that we have the inequality 


( e P e f) 2 ( e t e f> 2 < ( e p e f) 2 


( 140 ) 
which can be readily derived by contradiction (assume the converse and sum
over j on both sides). Hence by the triangle inequality and (46), the above is 
further bounded by 


/ oo / oo \ l/2\ 2 

< (^ X t) k [T, X ^ojl q ' /q (e e r e^) ) 

1 \= 1 ' ' 

oo \ 2k oo oo 

E^) E A lKlll2 9 ' /9 (e-, e f) 2 <p fe E A 5< e i’ e f) 2 ; 

*=1 ' i=i .7=1 


for 0 < p < 1. Combining this with (138), (139) we arrive at 


I vh - (vin q <%E A i<4 e ?> 2 £ k > i- 

j =l 


(141) 


If k = 0, we get from (48) that 


Af|K, - «z)1| 2 = ||(e* - 4,ef)||; <E A ^ll Cfc .ill«<^> e ?) 2 - ( 142 ) 

l=i 


If fc < 0 we have rf k ^ = ( 77 ® ■)', and hence the claim follows from (141) and 
(142). Observe that by telescoping and Kolmogorov’s zero one law, we also get 
that maxjgn || 9 < °°- 

□ 


Proof of Corollary 6.2. This follows from Lemma 10.3. 


□ 